
JOURNAL OF

Economic Theory


editor: Karl Shell
associate editors: 


A. B. Atkinson 
Robert J. Aumann
William A. Brock 
Donald J. Brown 
David Cass 
Paul Champsaur 
Gerard Debreu 
Duncan K. Foley 
David Gale 
Steven M. Goldman 
Jean-Michel Grandmont 
Frank H. Hahn 
Oliver D. Hart 
Walter Perrin Heller 
Werner Hildenbrand 


Jean Jaskold-Gabszewicz
Leif Johansen 
Mordecai Kurz 
Hayne Leland 
Robert E. Lucas, Jr. 
Edmond Malinvaud 
Andreu Mas-Colell 
Prasanta K. Pattanaik 
Roy Radner
Stephen A. Ross 
Michael Rothschild 
José A. Scheinkman
Dieter Sondermann 
David A. Starrett 


assistant editor: Virginia T. Glauser


Volume 13, 1976 



ACADEMIC PRESS 

New York and London 


A Subsidiary of Harcourt Brace Jovanovich, Publishers


Copyright © 1976 by Academic Press, Inc. 

All Rights Reserved 

No part of this publication may be reproduced or transmitted in any form or by any
means, electronic or mechanical, including photocopy, recording, or any information 
storage and retrieval system, without permission in writing from the copyright owner. 


Printed by the St. Catherine Press, Ltd., Bruges, Belgium



Contents of Volume 13 



Number 1, August 1976 


Elon Kohlberg. A Model of Economic Growth with Altruism
between Generations . 1

William V. Gehrlein and Peter C. Fishburn. The Probability of 

the Paradox of Voting: A Computable Solution ....... 14 


Mukul Majumdar, Tapan Mitra, and Daniel McFadden. On
Efficiency and Pareto Optimality of Competitive Programs in
Closed Multisector Models.

Tapan Mitra and Mukul Majumdar. A Note on the Role of the
Transversality Condition in Signalling Capital Overaccumulation . 47

Trout Rader. Equivalence of Consumer Surplus, the Divisia 

Index of Output, and Eisenberg’s Addilog Social Utility ... 58 

Carol A. Taylor. The Precautionary Demand for Money: A 


Utility Maximization Approach. 67 

Serge-Christophe Kolm. Unequal Inequalities. II . 82

John Roberts and Hugo Sonnenschein. On the Existence of 
Cournot Equilibrium without Concave Profit Functions ... 112 

Jacques Lesourne. The Optimal Growth of the Firm in a Growing 

Environment. 118 

Jerry S. Kelly. Rights Exercising and a Pareto-Consistent
Libertarian Claim . 138

Notes, Comments, and Letters to the Editor 

Bruce L. Miller. The Effect on Optimal Consumption of 
Increased Uncertainty in Labor Income in the Multiperiod 
Case. 154 


Number 2, October 1976 

Lars E. O. Svensson. Sequences of Temporary Equilibria, 

Stationary Point Expectations, and Pareto Efficiency .... 169 

Michael Maschler. An Advantage of the Bargaining Set over

the Core. 184 













Bengt Hansson and Henrik Sahlquist. A Proof Technique for 

Social Choice with Variable Electorate.193 

Gérard Fuchs. Asymptotic Stability of Stationary Temporary
Equilibria and Changes in Expectations . 201

Peter Gärdenfors. Manipulation of Social Choice Functions . 217

Robert J. Barro. Indexation in a Rational Expectations Model . 229

Michael J. P. Magill and George M. Constantinides. Portfolio
Selection with Transactions Costs . 245

Michael J. P. Magill. The Preferability of Investment Through a
Mutual Fund . 264


Formed Habits 

1. Robert A. Pollak. Habit Formation and Long-Run 

Utility Functions.272 

2. Ahmad E. El-Safty. Adaptive Behavior, Demand and 

Preferences.298 

3. Ahmad E. El-Safty. Adaptive Behavior and the Existence 

of Weizsäcker’s Long-Run Indifference Curves . 319

4. Peter J. Hammond. Endogenous Tastes and Stable Long- 

Run Choice.329 


Number 3, December 1976 


Stephen A. Ross. The Arbitrage Theory of Capital Asset Pricing 341 

Douglas H. Blair, Georges Bordes, Jerry S. Kelly, and Kotaro 
Suzumura. Impossibility Theorems without Collective 
Rationality. 361 


R. Manning. Issues in Optimal Educational Policy in the Context 
of Balanced Growth.380 


John Hartwick, Urs Schweizer, and Pravin Varaiya. Comparative
Statics of a Residential Economy with Several Classes . 396

Thomas Schwartz. Choice Functions, “Rationality” Conditions,
and Variations on the Weak Axiom of Revealed Preference . 414

E. Dierker, C. Fourgeaud, and W. Neuefeind. Increasing
Returns to Scale and Productive Systems . 428

Errol Glustoff. Differential Properties of Functions which are
Solutions to Maximization or Minimization Problems . 439














Notes, Comments, and Letters to the Editor 

Joop Hartog. Age-Income Profiles, Income Distribution 

and Transition Proportions. 448 

Paul Champsaur and Guy Laroque. A Note on the Core of 

Economies with Atoms or Syndicates.458 

Richard S. Toikka. The Welfare Implications of Becker’s 

Discrimination Coefficient.472 

Conway L. Lackman. A Household Consumption Model of 

Solid Waste.478 

Stanislaw Gomulka. Technological Condition for Balanced 

Growth: A Note on Professor Whitaker’s Contribution . 484 
Hal R. Varian. On the History of Concepts of Fairness . . 486 


Announcement and Call for Papers .488 

Author Index for Volume 13.489 










JOURNAL OF ECONOMIC THEORY

Volume 13, Number 1, August 1976 




Published bimonthly at 37 Tempelhof, Bruges, Belgium 
by Academic Press, Inc., 111 Fifth Avenue, New York, N.Y. 10003
1976: Volumes 12-13. Price per volume: $39.50 U.S.A.;

$42.50 outside U.S.A. (plus postage). 

Information concerning personal subscription rates may be obtained 
by writing to the Publisher. 

All correspondence and subscription orders should be addressed to the office of the 
Publishers at 111 Fifth Avenue, New York, N.Y. 10003. 

Send notices of change of address to the office of the Publishers at least 
6-8 weeks in advance. Please include both old and new addresses. 

Second class postage paid at Jamaica, N.Y. 

Air freight and mailing in the U.S.A. by Publications Expediting, Inc., 

200 Meacham Avenue, Elmont, New York 11003. 


Printed in Bruges, Belgium, by the St. Catherine Press, Ltd. 




JOURNAL OF ECONOMIC THEORY 13, 1-13 (1976)


A Model of Economic Growth 
with Altruism between Generations* 

Elon Kohlberg 

Harvard University, 1737 Cambridge Street, Room 404, 
Cambridge, Massachusetts 02138 

Received June 12, 1975; revised September 22, 1975 


Introduction 

The aim of this paper is to analyze the growth of an economy over a
period spanning many generations when the basic interests of different 
generations conflict. 

The analysis is concerned with a simple model of economic growth. 
In this model, one generation lives each period, consuming a portion of 
the capital stock bequeathed to it and investing the remainder in 
production. The output of the production process is bequeathed to the 
next generation. 

Of course, if each generation cares only about its own consumption, it 
will consume all that is bequeathed to it, and there will be no growth. 
Given, however, a sufficient degree of altruism between generations, the 
economy might be expected to grow. 

Attention is focused here on the case in which each generation displays 
a limited degree of altruism toward its immediate successor and no 
altruism at all toward more distant generations. This is, perhaps, the 
simplest setting in which growth is possible and yet different generations 
have conflicting interests. 

The problem that has to be analyzed is that a generation, in bequeathing 
capital to its heirs, would like them to consume all of the capital; yet it 
realizes that the heirs, out of their altruism to their own heirs, would not
do so. The analysis is carried out by introducing a game-theoretic equilibrium
notion which we call an “equilibrium consumption schedule.”
The main result is that the economy will either exhibit everlasting growth 

*Prepared under Contract No. N00014-67-A-0298-0019, Project No. NR 047-004,
for the Office of Naval Research. By acceptance of this article, the publisher or recipient 
acknowledges the U.S. Government's right to retain a nonexclusive, royalty-free license 
in and to any copyright covering the article. 



or everlasting decay, depending on the magnitude of the degree of altruism 
and of the return on the capital investment. 

Two desirable properties of any equilibrium notion are, of course, 
existence and uniqueness. We prove uniqueness of the equilibrium consumption
schedule but are unable to find reasonable conditions that
guarantee existence. This is left as an open problem. Still, for several
important special cases, we explicitly calculate the equilibrium consumption
schedule and note some of its properties.

The model that we study is due to Arrow [1] and Dasgupta [3]. Both
Arrow and Dasgupta emphasize the problem of optimal growth, using 
as a criterion for optimality Rawls' principle of just saving. In contrast, 
here we try to find the growth that would actually take place rather than 
that which ought to take place. 

The results of Arrow and Dasgupta and those reported here seem to 
suggest that optimal growth and “equilibrium” growth have markedly 
different properties. 

The notion of “equilibrium consumption schedule” is essentially the 
same as the notion of “equilibrium saving” introduced by Phelps and
Pollak [4, Sect. IV] and further studied by Phelps [5]. However, Phelps
and Pollak confine their attention to linear consumption schedules and
to the special utility function $u(k) = k^{\lambda}$ (in this connection, see our
Theorem 7).


I. The Model 


We consider an economy in which there is only one good and within 
each generation there is only one individual. The one good can either be
consumed or used as capital which bears a return. Let $k_t$ be the accumulated
capital at the beginning of time period $t$, and let $c_t$ be the consumption
of individual $t$ (the individual living in that period). The
remainder, $k_t - c_t$, is used in production. Each unit so used yields $\alpha > 0$
units at the beginning of the next time period, so that

$$k_{t+1} = \alpha(k_t - c_t). \tag{1}$$

The preferences of individual t are represented by the utility function 

$$u(c_t) + \beta u(c_{t+1}), \tag{2}$$

where $u$ is an increasing, continuously differentiable and strictly concave
function, and $\beta > 0$ is a “measure of altruism.”





We also assume 

$$\alpha\beta > 1. \tag{3}$$

Arrow [1, p. 329] calls an economy that satisfies this condition “utility 
productive” for the following reason. In a two period version of our 
model, (3) means that altruism and productivity are large enough to 
induce individual 1 to bequeath to individual 2 an amount exceeding his 
own consumption. 


2. Consumption Schedules 

Assuming that behavior is governed by utility maximization, individual 
1 should choose his consumption so as to maximize his utility 
$u(c_1) + \beta u(c_2)$; clearly, he should take into account that $c_2$ will be determined
by individual 2 so as to maximize his own utility $u(c_2) + \beta u(c_3)$, etc.

Thus, individual 1 must determine the $c_1$ that maximizes

$$u(c_1) + \beta u\{c[\alpha(k_1 - c_1)]\},$$

where c(k) is the optimum consumption of individual 2, given that the 
amount of capital bequest is k. 

To carry out this maximization, individual 1 must know the whole 
function $c(k)$; that is, he must know individual 2’s consumption schedule.

Since all the individuals in our model have identical utility functions, 
it is natural to look for a solution in which all individuals have the same 
consumption schedule. We are thus led to the following definition. 

A continuously differentiable function $c(k)$ is an equilibrium consumption
schedule if, for all $0 < k < \infty$,

$$\max_{0 \le c_1 \le k} \{u(c_1) + \beta u(c[\alpha(k - c_1)])\} \tag{*}$$

is obtained at $c(k)$.

An equilibrium consumption schedule may be thought of as a self- 
enforcing norm of behavior for society or, alternatively, as a game- 
theoretic equilibrium point. 
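
The fixed-point character of this definition can be checked numerically. The sketch below is only an illustration, not part of the paper's argument: it assumes $u = \log$ (a utility covered by the relaxed conditions of Section 9) and takes the schedule $c(k) = k/(1+\beta)$ reported there as the candidate. The function name `best_response`, the grid size, and the parameter values are hypothetical choices.

```python
import math

def best_response(k, alpha, beta, schedule, u, grid=20000):
    """Grid-search the c1 in (0, k) that maximizes
    u(c1) + beta * u(schedule(alpha * (k - c1))),
    i.e. the objective (*) when the heir follows `schedule`."""
    best_c, best_val = None, -math.inf
    for i in range(1, grid):
        c1 = k * i / grid
        val = u(c1) + beta * u(schedule(alpha * (k - c1)))
        if val > best_val:
            best_c, best_val = c1, val
    return best_c

# Hypothetical parameter values, chosen only for illustration.
alpha, beta = 3.0, 1.0

def candidate(k):
    """Candidate schedule c(k) = k / (1 + beta), assumed for u = log."""
    return k / (1.0 + beta)

for k in (1.0, 5.0, 10.0):
    # If the candidate is an equilibrium schedule, the best response to it
    # should (up to grid error) coincide with the candidate itself.
    print(k, best_response(k, alpha, beta, candidate, math.log), candidate(k))
```

A grid search is used rather than a first-order condition so that the check does not presuppose Lemma 1.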


3. The Main Results 

The first result is that an equilibrium consumption schedule must have 
a marginal propensity to consume between 0 and 1. 





Theorem 1. If $c$ is an equilibrium consumption schedule then $0 < c' < 1$.

Suppose an equilibrium consumption schedule is adopted by society. 
How, then, will the economy grow? From Eq. (1) it is obvious that the
capital stock increases or decreases according to whether or not
$c(k) < \bar{c}(k)$, where

$$\bar{c}(k) = [(\alpha - 1)/\alpha]\, k$$

is that level of consumption that maintains the stock of capital intact.

For example, if $c$ has a graph as in Fig. 1, then regardless of the initial
capital, the resulting sequence of capital stocks $\{k_t\}_{t=1}^{\infty}$ must converge to
the steady-state $\bar{k}$; and if $c$ has a graph as in Fig. 2, then the sequence $\{k_t\}$
must converge to either one of the steady-states $k_1$ or $k_2$, according to
whether the initial capital is less than or greater than $k_0$.

We now claim that an equilibrium consumption schedule can never have 
a graph as in Fig. 1 or 2, and that convergence to a steady-state is, 
essentially, impossible. 

Theorem 2. Any equilibrium consumption schedule must have one of
the following three forms:

(i) $c(k) < \bar{c}(k)$ for all $0 < k < \infty$. Every resulting sequence of
capital stocks $\{k_t\}$ increases to $\infty$, regardless of initial capital.







(ii) $c(k) > \bar{c}(k)$ for all $0 < k < \infty$. Every resulting sequence of
capital stocks $\{k_t\}$ decreases to 0, regardless of initial capital.

(iii) $c(k) = \bar{c}(k)$. Every resulting sequence of capital stocks $\{k_t\}$
is constant: $k_t = k_1$.

Moreover, (i), (ii), or (iii) holds according to whether $(\alpha - 1)\beta$ is $>1$,
$<1$ or $=1$.

As accords with intuition, the economy grows when the return on 
invested capital and the measure of altruism are sufficiently high. Also, 
note that the conditions of the theorem are independent of the function u. 
We will further discuss the meaning of Theorem 2 in Section 10. 
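
A small simulation may make the trichotomy concrete. The sketch below is illustrative only: it assumes the linear schedule $c(k) = k/(1+\beta)$ that the paper reports for $u(k) = \log k$ in Section 9, and the names `regime` and `capital_path` and the parameter pairs are hypothetical.

```python
def regime(alpha, beta):
    """Classify long-run behaviour by comparing (alpha - 1) * beta with 1,
    as in the last sentence of Theorem 2."""
    s = (alpha - 1.0) * beta
    if s > 1.0:
        return "growth"
    if s < 1.0:
        return "decay"
    return "constant"

def capital_path(k1, alpha, schedule, periods):
    """Iterate Eq. (1): k_{t+1} = alpha * (k_t - c(k_t))."""
    path = [k1]
    for _ in range(periods):
        k = path[-1]
        path.append(alpha * (k - schedule(k)))
    return path

# Hypothetical (alpha, beta) pairs, one for each case of the theorem.
for alpha, beta in [(3.0, 1.0), (1.5, 1.0), (2.0, 1.0)]:
    # Assumed schedule for u = log: c(k) = k / (1 + beta).
    schedule = lambda k, b=beta: k / (1.0 + b)
    path = capital_path(1.0, alpha, schedule, 5)
    print(alpha, beta, regime(alpha, beta), [round(k, 4) for k in path])
```

Under this schedule the capital stock is multiplied by $\alpha\beta/(1+\beta)$ each period, which exceeds 1 exactly when $(\alpha - 1)\beta > 1$.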

Our second result is a uniqueness theorem: 

Theorem 3. There is at most one equilibrium consumption schedule. 

A somewhat surprising result is that there is no corresponding existence 
theorem: We will present an example in which there is no equilibrium consumption
schedule.

We were unable to find reasonable conditions that will guarantee
existence. We conjecture that one such condition is that the measure of
risk aversion $-u''/u'$ be a nonincreasing function (that this is, indeed, a
plausible assumption is explained in [2, p. 96]).


4. Proof of Theorem 1 

Lemma 1. Let $c$ be an equilibrium consumption schedule. Then for
every $0 < k < \infty$,

$$u'[c(k)] = \alpha\beta\, u'[c(x)]\, c'(x), \tag{4}$$

where

$$x = x(k) = \alpha[k - c(k)]. \tag{5}$$


Proof. The above is just the statement that, at the maximum point
$c(k)$, the derivative (with respect to the variable $c_1$) of (*) is zero. Of
course, we have to make sure that the maximum is not attained at a corner.
This is done in the Appendix. Q.E.D.

Clearly, $c(0) = 0$ and $x(0) = 0$. Letting $k \to 0$ in (4), we obtain
$u'(0) = \alpha\beta u'(0)\, c'(0)$; that is

$$c'(0) = 1/\alpha\beta. \tag{6}$$


Lemma 2. $x(k) = \alpha(k - c[k])$ is strictly monotone increasing.





Proof. Differentiating (5) and using (6) we get $x'(0) = \alpha[1 - (1/\alpha\beta)]$.
By (3), $x'(0) > 0$. So it is sufficient to prove that $x(k)$ is one-to-one.

Suppose $x(k_1) = x(k_2)$. By (4), $u'(c[k_1]) = u'(c[k_2])$ and since $u'$ is
strictly decreasing $c(k_1) = c(k_2)$. It now follows from (5) that $k_1 = k_2$.

Q.E.D.

Applying Lemma 2 we have $0 \le x'(k) = \alpha(1 - c'(k))$, hence $c'(k) \le 1$.
Also, by (4), $c' > 0$. This proves Theorem 1.


5. Proof of Theorem 2 

Suppose $(\alpha - 1)\beta > 1$. Then $1/\alpha\beta < (\alpha - 1)/\alpha$. By (6), $c(k)$ is initially
(near 0) below $\bar{c}(k) = [(\alpha - 1)/\alpha]\, k$. We claim that $c(k)$ must stay below
$\bar{c}(k)$. Indeed, otherwise there must be some point $k_0 > 0$ at which $c(k)$
and $\bar{c}(k)$ intersect for the first time. Since $c(k)$ intersects $\bar{c}(k)$ from below,
$c'(k_0) \ge (\alpha - 1)/\alpha$. On the other hand, $x(k_0) = \alpha(k_0 - [(\alpha - 1)/\alpha]\, k_0) = k_0$
so that, by (4), $c'(k_0) = 1/\alpha\beta$. It follows that $1/\alpha\beta \ge (\alpha - 1)/\alpha$, a
contradiction.

Let $\{k_t\}$ be a sequence of capital stocks that results from the adoption
of $c$ as consumption schedule. As we have just seen, $c(k) < \bar{c}(k)$ for all
$k > 0$, so $\{k_t\}$ is monotonically increasing, and therefore it converges to
some $k_0$. We claim that $k_0 = \infty$. Otherwise, it would follow from (1)
that $\{c_t\}$ converges to some $c_0$ such that $k_0 = \alpha(k_0 - c_0)$; that is, $(k_0, c_0)$
is on the graph of $\bar{c}$. Since $(k_t, c_t)$, which are on the graph of $c$, converge
to $(k_0, c_0)$, it follows that $c$ and $\bar{c}$ intersect at $k_0$, a contradiction.

Very similar arguments show that $(\alpha - 1)\beta < 1 \Rightarrow$ case (ii) and
$(\alpha - 1)\beta = 1 \Rightarrow$ case (iii).


6. Proof of Theorem 3 

For the sake of simplicity, we will prove Theorem 3 only for the case
$(\alpha - 1)\beta > 1$.

We start by rewriting Eq. (4) as a differential equation:

$$c'(x) = \frac{1}{\alpha\beta}\, \frac{u'(c[k])}{u'(c[x])} \quad \text{(initial condition } c(0) = 0\text{)}, \tag{7}$$

where $k = k(x)$ is the solution of $x = \alpha(k - c[k])$.

Lemma 2 shows that $k(x)$ is well defined.

Remark. Equation (7) may be demonstrated in a diagram (see Fig. 3). 






Given $x$, we first find $k(x)$ by moving up to the line $\bar{c}(k)$ and then along
a 45° line until we intersect the graph of $c(k)$ (the set $\{(k, c): \alpha(k - c) = x\}$
is a 45° line that includes both the point $(k(x), c[k(x)])$ and the point
$(x, \bar{c}[x])$). We may then perform the computation on the right-hand
side of (7).

Clearly, (7) is not an ordinary differential equation, since $c'(x)$ depends
not only on $x$ and on $c(x)$ but also on the value that the function $c$ takes
on in the entire interval $[0, x]$.

Let $c_1(x)$ and $c_2(x)$ be two solutions of Eq. (7), and let $k_1(x)$ and $k_2(x)$
be the corresponding $k$ functions.

Figure 4 illustrates that if we restrict our attention to some neighborhood
of 0 in which the slopes of the $c_i$ functions are bounded away from 1
(such a neighborhood exists since, by (6) and (3), $c_i'(0) < 1$), then, for
any $x$ in that neighborhood,

$$|c_1(k_1[x]) - c_2(k_2[x])| \le K'\, |c_1(k_1[x]) - c_2(k_1[x])|,$$

where $K'$ is some constant.






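If we now define

$$\Psi_i(x) = \frac{1}{\alpha\beta}\, \frac{u'(c_i[k_i(x)])}{u'(c_i[x])}, \qquad i = 1, 2,$$

then it follows from the above inequality and from the continuous
differentiability of $u'$ that

$$\max_{0 \le x \le \epsilon} |\Psi_1(x) - \Psi_2(x)| \le K \max_{0 \le x \le \epsilon} |c_1(x) - c_2(x)|$$

for all sufficiently small $\epsilon$, where $K$ is some constant that does not depend
on $\epsilon$.

Let $0 \le x \le \epsilon$. By (7) and the above inequality,

$$|c_1(x) - c_2(x)| \le \int_0^x |\Psi_1(x) - \Psi_2(x)|\, dx \le \epsilon K \max_{0 \le x \le \epsilon} |c_1(x) - c_2(x)|.$$

If we choose $\epsilon$ so that $\epsilon K \le 1/2$ then we have

$$\max_{0 \le x \le \epsilon} |c_1(x) - c_2(x)| \le \tfrac{1}{2} \max_{0 \le x \le \epsilon} |c_1(x) - c_2(x)|$$

and it follows that $c_1(x) \equiv c_2(x)$ on $[0, \epsilon]$.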

We have demonstrated that (7) has a unique solution, say $c_1(x)$, in
$[0, \epsilon]$. Let us now concentrate on the interval $[\epsilon, x(\epsilon)]$. By Lemma 2,
the function $k(x)$ on this interval is determined by the values that $c$ takes
on in $[0, \epsilon]$. Thus, $k(x)$ is determined, say $k(x) = k_1(x)$, on $[\epsilon, x(\epsilon)]$, and
(7) becomes an ordinary differential equation on $[\epsilon, x(\epsilon)]$:

$$c'(x) = \frac{1}{\alpha\beta}\, \frac{u'(c_1[k_1(x)])}{u'(c[x])} = f(x, c[x]).$$

Since $u'$ is continuously differentiable, $f$ is Lipschitz in its second variable
and it follows that the solution is uniquely determined in $[\epsilon, x(\epsilon)]$.

To complete the proof of Theorem 3 all we need is to show that the
sequence $\epsilon_0 = \epsilon$, $\epsilon_1 = x(\epsilon)$, $\epsilon_2 = x(\epsilon_1)$, ..., $\epsilon_n = x(\epsilon_{n-1})$, ... goes up to
$+\infty$. Recall that we have assumed (in this section) that $(\alpha - 1)\beta > 1$.
By Theorem 2,

$$x(k) > k \quad \text{for all } k \tag{8}$$

and it follows that $\{\epsilon_n\}$ is monotonically increasing. If $\{\epsilon_n\}$ does not go
up to $+\infty$ then it has a finite limit, say $\bar{\epsilon}$. But then $x(\epsilon_n) = \epsilon_{n+1} \to \bar{\epsilon}$,
hence $x(\bar{\epsilon}) = \bar{\epsilon}$, a contradiction to (8).





7. A Counterexample 

By Theorem 1, every equilibrium consumption schedule must satisfy

$$c'(k) < 1 \quad \text{for all } k. \tag{9}$$

On the other hand, every equilibrium consumption schedule must be 
the solution of the differential equation (7). 

To create a counterexample to an existence theorem, it is thus sufficient 
to construct a function u such that the resulting solution to the differential 
equation (7) will have slope exceeding 1. 

Let us describe such a construction for the case $(\alpha - 1)\beta > 1$. First
we define $u$ on some interval $[0, a]$ in such a way that $u'$ decreases very
slowly. This will make the right-hand side of (7) stay close to $1/\alpha\beta$, thus
smaller than $(\alpha - 1)/\alpha$. As a result, $x - k(x)$ will become very large. Next,
we define $u$ on some interval $[a, b]$ in such a way that $u'$ decreases very
rapidly. Now, as $x$ increases from $a$ to $b$, $u'(c[x])$ decreases rapidly;
at the same time $k(x)$ is still (for a long while) in $[0, a]$, so $u'(c[k])$ decreases
very slowly. It is thus fairly easy for the right-hand side of (7) to exceed 1.

The above discussion might also explain why we conjecture that a
decreasing measure of risk aversion $-u''/u'$ should ensure the existence of
an equilibrium consumption schedule.


8. Special Cases 

A natural question to ask is, When is the equilibrium consumption 
schedule linear? The answer is, essentially, never. 

Theorem 4. Unless $(\alpha - 1)\beta = 1$, the equilibrium consumption
schedule is never linear.

Proof. Since $c'(0) = 1/\alpha\beta$, the only candidate for a linear equilibrium
consumption schedule is $c(k) = (1/\alpha\beta)\, k$. But then it follows from (4)
that $u'(c[k]) = u'(c[x])$, hence $c(k) = c(x)$. Since $c$ is linear with nonzero
slope, it follows that $k = x$. But then we conclude from (5) that the
(constant) slope of $c$ is $(\alpha - 1)/\alpha$. Thus $(\alpha - 1)/\alpha = 1/\alpha\beta$.

Remark. If we were willing to assume that both $u$ and $c$ are twice
differentiable, then we could differentiate Eq. (7) and get

$$c''(0) = -\frac{u''(0)}{u'(0)} \cdot \frac{1}{(\alpha\beta)^2(\alpha\beta - 1)}\, [(\alpha - 1)\beta - 1].$$

Thus, $c(x)$ is convex or concave in a neighborhood of 0 according to
whether $(\alpha - 1)\beta$ is $>1$ or $<1$.





The next question we ask is, When is the equilibrium consumption 
schedule asymptotically linear? 

Theorem 5. Assume that $(\alpha - 1)\beta \ne 1$. Then the only function $u$ for
which the equilibrium consumption schedule is asymptotically linear is
$u(k) = 1 - e^{-rk}$. The linear asymptote is
$[(\alpha - 1)/\alpha]\, k - (1/r) \cdot \{[\log(\alpha - 1)\beta]/(\alpha - 1)\}$.

Rather than go into the details of the proof (which is analogous to the 
proof of Theorem 6, but more cumbersome) let us make a few remarks 
regarding the above theorem. 

As expected, the asymptote lies above or below the line $[(\alpha - 1)/\alpha]\, k$
according to whether $(\alpha - 1)\beta < 1$ or $(\alpha - 1)\beta > 1$.

When $(\alpha - 1)\beta > 1$, the economy expands (asymptotically) at a
constant rate. That is, any resulting sequence of capital stocks $\{k_t\}$ satisfies
$k_{t+1} - k_t \approx [\alpha/(\alpha - 1)] \log[(\alpha - 1)\beta] \cdot (1/r)$. It is interesting to note
that the expansion rate is inversely proportional to $r$.
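
The two formulas above are linked by Eq. (1): if consumption sits exactly on the asymptote, the one-period capital increment equals $-\alpha$ times the asymptote's intercept. The sketch below checks this arithmetic; the function names and parameter values are hypothetical, and it evaluates only the stated formulas, not the equilibrium schedule itself.

```python
import math

def asymptote(alpha, beta, r):
    """Slope and intercept of the linear asymptote stated in Theorem 5."""
    slope = (alpha - 1.0) / alpha
    intercept = -(1.0 / r) * math.log((alpha - 1.0) * beta) / (alpha - 1.0)
    return slope, intercept

def asymptotic_increment(alpha, beta, r):
    """Stated limit of k_{t+1} - k_t for the case (alpha - 1) * beta > 1."""
    return (alpha / (alpha - 1.0)) * math.log((alpha - 1.0) * beta) / r

# Hypothetical parameter values with (alpha - 1) * beta = 2 > 1.
alpha, beta, r = 3.0, 1.0, 1.0
slope, intercept = asymptote(alpha, beta, r)

# One step of Eq. (1) with consumption on the asymptote:
# k' = alpha * (k - slope * k - intercept) = k - alpha * intercept.
k = 100.0
k_next = alpha * (k - (slope * k + intercept))
print(k_next - k, asymptotic_increment(alpha, beta, r))
```

Doubling `r` halves the increment, which is the inverse proportionality noted in the text.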

More detailed examination shows that the graph of c(k) displays 
damped oscillations around the asymptote. 

With the aid of Theorem 5, we may write down a formula for a function
$u$ that satisfies all our assumptions and for which there exists no equilibrium
consumption schedule:

$$u(k) = \int_0^k e^{-x^2}\, dx.$$

Let $(\alpha - 1)\beta > 1$ and suppose that $c$ is an equilibrium consumption
schedule. Substituting the above formula in Eq. (7), we get

$$c'(x) = (1/\alpha\beta)\, e^{c(x)^2 - c(k)^2} = (1/\alpha\beta)\, e^{[c(x) - c(k)][c(x) + c(k)]}.$$

Clearly, $c'(x) \ge 1/\alpha\beta$ so that $\lim_{x \to \infty} [c(x) + c(k)] = \infty$; but $c'(x) < 1$
[see (9)], hence

$$\lim_{x \to \infty} [c(x) - c(k)] = 0.$$

It is easily verified that this last condition implies that $c(k)$ must be
asymptotic to the line $[(\alpha - 1)/\alpha]\, k$. This is a contradiction to Theorem 5.


9. Relaxed Conditions on u 

In this section we change slightly the underlying assumptions of our
model. Specifically, we no longer assume that $u$ is differentiable (or even
defined) at 0.





Theorem 7. Assume $(\alpha - 1)\beta \ne 1$. The equilibrium consumption
schedule is linear if and only if $u$ is either $k^{\lambda}$ ($\lambda < 0$) or $\log k$.

Proof. Suppose $c(k) = ak$ is an equilibrium consumption schedule.
From (4) and (5) it follows that

$$u'(ak) = \alpha\beta a\, u'(a\alpha(1 - a)\, k).$$

But note that the only continuous function that satisfies an equation
$f(rx) = b f(x)$ for all $x$ is $f(x) = \text{Const.}\, x^{\log_r b}$. Thus

$$u'(k) = \text{Const.}\, k^{-\log_{\alpha(1-a)} \alpha\beta a}. \qquad \text{Q.E.D.}$$

Let us now assume that $u$ is of the form $k^{\lambda}$ or $\log k$, and let us find
the linear consumption schedule $c(k) = ak$. To do this, we must find $a$
by solving the equation

$$\log_{\alpha(1-a)} \alpha\beta a = 1 - \lambda.$$

Denote the left-hand side of this equation by $g(a)$. It is easily verified
that $g$ is negative except in the interval between $1/\alpha\beta$ and $(\alpha - 1)/\alpha$, that
$g$ is monotone in that interval, and that $g(1/\alpha\beta) = 0$, $g[(\alpha - 1)/\alpha] = +\infty$.
It follows that $g$ assumes the positive value $1 - \lambda$ exactly once, at a point
between $1/\alpha\beta$ and $(\alpha - 1)/\alpha$.

We have thus established that $c(k) = ak$ lies below or above the line
$\bar{c}(k)$ according to whether $1/\alpha\beta < (\alpha - 1)/\alpha$ or $1/\alpha\beta > (\alpha - 1)/\alpha$. That is,

Theorem 8. The statements in Theorem 2 are valid for all linear 
equilibrium consumption schedules. 

Remark. The equilibrium consumption schedule for $u(k) = \log k$ is
$c(k) = [1/(1 + \beta)]\, k$.

This is readily seen by solving the equation

$$\log_{\alpha(1-a)} \alpha\beta a = 1.$$
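
The slope $a$ can also be found numerically. The sketch below bisects on $g(a) = \log_{\alpha(1-a)} \alpha\beta a$ between the two endpoints named in the text, relying on the stated monotonicity of $g$ there. It assumes $\alpha > 1$, $(\alpha - 1)\beta \ne 1$, and $\lambda < 1$; the names `g` and `linear_slope` and the iteration count are hypothetical choices.

```python
import math

def g(a, alpha, beta):
    """Left-hand side of the slope equation, log base alpha*(1-a) of alpha*beta*a."""
    return math.log(alpha * beta * a) / math.log(alpha * (1.0 - a))

def linear_slope(alpha, beta, lam, iterations=200):
    """Bisection for the a between 1/(alpha*beta) and (alpha-1)/alpha
    that solves g(a) = 1 - lam."""
    target = 1.0 - lam
    below = 1.0 / (alpha * beta)    # g = 0 < target at this endpoint
    above = (alpha - 1.0) / alpha   # g tends to +infinity at this endpoint
    for _ in range(iterations):
        mid = 0.5 * (below + above)
        if g(mid, alpha, beta) < target:
            below = mid
        else:
            above = mid
    return 0.5 * (below + above)

# lam = 0 corresponds to u = log k; hypothetical parameter pairs covering
# the growing case (3, 1) and the decaying case (1.5, 1).
for alpha, beta in [(3.0, 1.0), (1.5, 1.0)]:
    print(alpha, beta, linear_slope(alpha, beta, 0.0), 1.0 / (1.0 + beta))
```

Tracking the endpoints by the sign of $g - (1 - \lambda)$ rather than by their order lets the same loop handle both $1/\alpha\beta < (\alpha - 1)/\alpha$ and the reverse.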


10. Discussion of the Results 

The most important aspect of the model under study is that it exhibits 
disagreement between generations as to the relative importance of con¬ 
sumption in different time periods in the future. 





To isolate the above aspect, let us compare our model to one in which 
there is no such disagreement. Specifically, let us consider a model where 
(2) is replaced by 

$$\sum_{s=t}^{\infty} \beta^{s-t} u(c_s). \tag{2'}$$

In this model, individual 1 does not have to deal with the problem that
individual 2 might not consume as much as he (individual 1) would want
him to. In fact, let $\{c_t\}_1^{\infty}$ be the consumption sequence that maximizes
individual 1’s utility; then $\{c_t\}_2^{\infty}$ maximizes individual 2’s utility, $\{c_t\}_3^{\infty}$
maximizes individual 3’s utility, etc. This sequence $\{c_t\}$ is determined by
the condition $u'(c_t) = \alpha\beta u'(c_{t+1})$. Thus, the condition for growth is
$\alpha\beta > 1$.

Theorem 2 states that in our original model, the condition for growth
is $(\alpha - 1)\beta > 1$. This is a quantitative statement of our intuitive feeling
that the growth of the economy is impeded by intergenerational 
disagreement. 
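
The gap between the two growth conditions can be tabulated directly. The sketch below is illustrative only; the function name `growth_conditions` and the parameter pairs are hypothetical.

```python
def growth_conditions(alpha, beta):
    """Return (grows without disagreement, grows in equilibrium), i.e.
    (alpha * beta > 1, (alpha - 1) * beta > 1)."""
    return alpha * beta > 1.0, (alpha - 1.0) * beta > 1.0

# Hypothetical parameter pairs. The middle one satisfies alpha*beta > 1 but
# not (alpha - 1)*beta > 1: growth occurs only when generations agree.
for alpha, beta in [(3.0, 1.0), (1.5, 1.0), (0.5, 1.0)]:
    print(alpha, beta, growth_conditions(alpha, beta))
```

Since $\beta > 0$, the second condition implies the first, so the set of growing economies shrinks when disagreement is introduced.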


Appendix 

We will prove that the maximum in (*) cannot be attained at a corner, 

0 or k. 

Assume $c(k) \equiv 0$ in a neighborhood of 0. Then, for a sufficiently small
$k$, both $k$ and $x(k)$ are in that neighborhood. This is certainly impossible
(an individual who starts out with $k$ will not decide to consume nothing
if he knows that his heir would consume nothing of what was left to him).

Assume $c(k) \equiv k$ in a neighborhood of 0. Since the max in (*) is
obtained at the right-hand corner,

$$u'(c[k]) \ge \alpha\beta u'(c[x])\, c'(x) \tag{10}$$

for sufficiently small $k$. Substituting $c(k) = k$, $x = x(k) = 0$, and
$c'(x) = 1$, we get

$$u'(k) \ge \alpha\beta u'(0) > u'(0),$$

a contradiction to the concavity of $u$.

Since $c$ is continuous, it follows from the above two paragraphs that
there is no neighborhood of 0 in which $c$ takes on just the values 0 or $k$.

So there must be a sequence $\{k_n\}$ that converges to 0 and which consists of
points for which (4) holds. Going to the limit we obtain

$$c'(0) = 1/\alpha\beta. \tag{6}$$





Now we can prove that $c(k) \ne k$ when $k > 0$. Substituting $c(k) = k$,
$x = x(k) = 0$ and $c'(0) = 1/\alpha\beta$ in (10) we obtain $u'(k) \ge u'(0)$, a
contradiction to the strict concavity of $u$.

We conclude that the max in (*) is either interior or at the left corner (0);
in any case, $u'(c[k]) \le \alpha\beta u'(c[x])\, c'(x)$ so that

$$c'(x) \ge \frac{1}{\alpha\beta}\, \frac{u'(c[k])}{u'(c[x])} > 0$$

and $c$ is strictly monotone increasing. This proves that $c(k) > 0$ when
$k > 0$.


Acknowledgments 

I am indebted to Frank Hahn for calling my attention to the work of Phelps and
Pollak [4]. I am also indebted to Kenneth Arrow, Truman Bewley, and, especially,
Steven Shavell, for helpful discussions. 


References 

1. K. J. Arrow, Rawls’ principle of just saving, Swedish J. Econ. 75 (1973).

2. K. J. Arrow, “Theory of Risk-Bearing,” Markham, Chicago, 1971. 

3. P. Dasgupta, On some problems arising from Professor Rawls’ conception of
distributive justice, Theory and Decision 4 (April 1974).

4. E. S. Phelps and R. A. Pollak, On second-best national saving and game-equilibrium
growth, Rev. Econ. Studies 35 (April 1968).

5. E. S. Phelps, The indeterminacy of game-equilibrium growth in the absence of an 
ethic, in “Altruism, Morality, and Economic Theory” (E. S. Phelps, Ed.), Russell 
Sage Foundation, New York, 1975. 



JOURNAL OF ECONOMIC THEORY 13, 14-25 (1976) 


The Probability of the Paradox of Voting: 

A Computable Solution* 

William V. Gehrlein and Peter C. Fishburn 

The Pennsylvania State University, College of Business Administration,
University Park, Pennsylvania 16802

Received July 10, 1975 


1. Introduction 

Condorcet's voting paradox occurs when every alternative in an election
can be beaten by some other alternative on the basis of sincere simple
majority voting. Studies of the paradox often presume that the number $m$
of alternatives is finite and that there is an odd number $n$ of voters each
of whom has one of the $m!$ linear orders as his preference order on the
alternatives. In addition to these presumptions, we shall assume, for
purposes of computing the probability of Condorcet’s paradox, that there
is a vector of probabilities $q = (q_1, q_2, \ldots, q_{m!})$ with $\sum q_k = 1$ such that
each voter independently selects the linear order numbered $k$ on the
alternatives according to probability $q_k$. The impartial culture (IC)
condition is said to hold when $q_k = 1/m!$ for all $k$. Under IC each of the
$m!$ linear orders on the alternatives is equally likely to be held by each
voter.

Because $n$ is assumed to be odd and each voter is assumed to have a
linear preference order, the probability of the paradox is one minus the
probability that some alternative beats all others on the basis of simple
majority comparisons. Throughout, we shall let $P_r(x_i, n; q)$ be the probability
that $x_i$ beats all other alternatives in $\{x_1, \ldots, x_r\}$ by simple majorities
when all $x_j$ are distinct, $r \le m$, and each voter independently
selects a linear order on the $m$ alternatives according to $q$. Then
$1 - \sum_{i=1}^{m} P_m(x_i, n; q)$ is the paradox probability in this situation. When
condition IC is presumed to apply, we shall write $P_m(x_i, n; q)$ as $P_m(x_i, n)$,
with $P_m(x_1, n) = P_m(x_2, n) = \cdots = P_m(x_m, n)$ by the symmetry of IC.

Under IC the paradox probability is $1 - mP_m(x_1, n)$, and the probability
that some alternative beats all others is $P_m(n) = mP_m(x_1, n)$.
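
For very small $m$ and $n$ these definitions can be evaluated by brute force. The sketch below is not the method developed in this paper: it enumerates all $(m!)^n$ preference profiles, which is only feasible for tiny cases. The function name `paradox_probability` and the ordering of the linear orders (the one produced by `itertools.permutations`) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

def paradox_probability(m, n, q=None):
    """Probability that no alternative beats all others by simple majority.

    q[k] is the probability that a voter holds the k-th linear order, in the
    order generated by itertools.permutations; q=None means impartial culture.
    """
    orders = list(permutations(range(m)))
    if q is None:
        q = [Fraction(1, len(orders))] * len(orders)
    paradox = Fraction(0)
    for profile in product(range(len(orders)), repeat=n):
        weight = Fraction(1)
        for k in profile:
            weight *= q[k]
        if weight == 0:
            continue
        # a beats b when more than half of the voters rank a ahead of b.
        has_winner = any(
            all(2 * sum(orders[k].index(a) < orders[k].index(b)
                        for k in profile) > n
                for b in range(m) if b != a)
            for a in range(m)
        )
        if not has_winner:
            paradox += weight
    return paradox

print(paradox_probability(3, 3))  # impartial culture, three voters
# A q concentrated on the three cyclic orders (0,1,2), (1,2,0), (2,0,1).
third = Fraction(1, 3)
print(paradox_probability(3, 3, [third, 0, 0, third, third, 0]))
```

Exact rational arithmetic via `Fraction` avoids rounding when comparing against tabulated values.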


* This research was supported by the National Science Foundation. 





Previous studies on the probability of Condorcet’s paradox have
included the consideration of how changes in $m$ and $n$ affect $P_m(n)$ [5],
the development of simulation estimates of $P_m(n)$ [6, 9], the development
of methods for approximating the paradox probability $1 - \sum P_m(x_i, n; q)$
for various $q$ vectors [3, 7, 8], and the determination of exact values of
$P_m(n)$ for small $m$ and $n$ by computer enumeration routines [11]. Analytical
attempts to find a simple closed-form expression for the paradox probability
in terms of $m$, $n$ and $q$ have been conducted by DeMeyer and Plott [1],
Garman and Kamien [3] and Niemi and Weisberg [8]. However, as noted
by Kelly [5] and Pomeranz and Weil [9], none of these studies has provided
a method of calculating the exact probability of the voting paradox in
any reasonable amount of time for relatively small $m$ and $n$. In this paper
we shall develop an exact solution for the paradox probability that is
easily computable for $m$ and $n$ that are quite large by previous standards.

In addition, we shall consider recursion relations which express $P_m(n)$
in terms of the $P_{m'}(n)$ for $m' < m$ [2, 7]. Previously, the only known
relation of this type has been May’s theorem, $1 - P_4(n) = 2[1 - P_3(n)]$,
or $P_4(n) = 2P_3(n) - 1$, which says that the paradox probability under IC
with four alternatives and $n$ voters is double the paradox probability
under IC with three alternatives and $n$ voters. In this paper we shall show
that similar recursion relations exist for all even numbers of alternatives.
Hence, when $m$ is even, $P_m(n)$ equals a linear combination of
$P_3(n), P_5(n), \ldots, P_{m-1}(n)$. Similar relations do not appear to exist
for odd values of $m$. However, a simple relationship will be given which
approximates $P_5(n)$ very closely by an expression of the form
$1 - f(n) + f(n)P_3(n)$ in which $f(n)$ takes on an elementary form.

The paper is organized as follows. Section 2 develops an exact relation
for $P_3(x_i, n; q)$, and Section 3 does likewise for $P_4(x_i, n; q)$. An
exact expression for $P_5(n)$ is presented in Section 4 along with the
aforementioned approximation of $P_5(n)$ in terms of $P_3(n)$. Section 5 then
proves the existence of the recursion relations for even values of $m$.
Computations based on our analytical developments are provided in tabular
form. Table I, which lists exact (or approximately exact) values of $P_m(n)$
for various $(m, n)$ pairs, significantly extends previous computational
results.


2. Three Alternatives 

Let $X$ be the set of alternatives with $\#X = m \ge 3$, where $\#X$ is the
cardinality of $X$, and let $M$ be the set of $m!$ linear preference orders
on $X$. We shall let $>$ denote a generic order in $M$. For any three distinct
alternatives $x_1$, $x_2$, and $x_3$ in $X$, we partition $M$ into four
subsets as follows:




GEHRLEIN AND FISHBURN 


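$$\begin{aligned}
S_1 &= \{> \in M : x_2 > x_1 \text{ and } x_3 > x_1\} \\
S_2 &= \{> \in M : x_2 > x_1 > x_3\} \\
S_3 &= \{> \in M : x_3 > x_1 > x_2\} \\
S_4 &= \{> \in M : x_1 > x_2 \text{ and } x_1 > x_3\}.
\end{aligned} \tag{1}$$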

Given a probability distribution $q$ on $M$, let $p_i$ be obtained from $q$
as the probability that a voter’s preference order satisfies the conditions
of $S_i$. For example, $p_1$ is the sum of $q(>)$ for all $>$ in $S_1$. Also
let $n_i$ be the number of voters with orders in $S_i$.

Under this formulation, $x_1$ has a simple majority over each of $x_2$ and
$x_3$ if and only if

$$\begin{aligned}
n_1 + n_2 &\le (n - 1)/2 \\
n_1 + n_3 &\le (n - 1)/2.
\end{aligned} \tag{2}$$

To find the probability $P_3(x_1, n; q)$ that $x_1$ is the simple majority
winner within $\{x_1, x_2, x_3\}$ when there are $n$ voters and $q$ obtains,
a relation is needed to sum the probabilities of all $n_i$ configurations
that are consistent with (2). Observing that (2) holds if and only if

$$\begin{aligned}
0 &\le n_1 \le (n - 1)/2 \\
0 &\le n_2 \le (n - 1)/2 - n_1 \\
0 &\le n_3 \le (n - 1)/2 - n_1,
\end{aligned} \tag{3}$$

it follows that

$$P_3(x_1, n; q) = \sum\nolimits^{1} \frac{n!}{n_1!\, n_2!\, n_3!\, n_4!}\;
p_1^{n_1} p_2^{n_2} p_3^{n_3} p_4^{n_4}, \tag{4}$$

where $\sum^{1}$ is a triple summation whose summation limits are given by
(3), and $n_4 = n - n_1 - n_2 - n_3$.

Suppose $\#X = 3$. Then the six linear orders in $M$ along with their $q_i$
values can be listed as follows:

$$\begin{array}{ll}
q_1: x_1 > x_2 > x_3 & \qquad q_4: x_2 > x_3 > x_1 \\
q_2: x_1 > x_3 > x_2 & \qquad q_5: x_3 > x_1 > x_2 \\
q_3: x_2 > x_1 > x_3 & \qquad q_6: x_3 > x_2 > x_1
\end{array}$$

It then follows from (1) that

$$\begin{aligned}
p_1 &= q_4 + q_6, & p_3 &= q_5, \\
p_2 &= q_3, & p_4 &= q_1 + q_2.
\end{aligned} \tag{5}$$


When $X = \{x_1, x_2, x_3\}$, the probability $P_3(x_1, n; q)$ that $x_1$ is
the simple majority winner is obtained by substituting (5) into (4).
$P_3(x_2, n; q)$ and $P_3(x_3, n; q)$ are computed in a similar manner. The
paradox probability in this case is therefore
$1 - \sum_{i=1}^{3} P_3(x_i, n; q)$ where

$$\begin{aligned}
\sum_{i=1}^{3} P_3(x_i, n; q) = \sum\nolimits^{1}
\frac{n!}{n_1!\, n_2!\, n_3!\, n_4!}
\big[ & (q_4 + q_6)^{n_1} q_3^{n_2} q_5^{n_3} (q_1 + q_2)^{n_4} \\
& + (q_2 + q_5)^{n_1} q_1^{n_2} q_6^{n_3} (q_3 + q_4)^{n_4} \\
& + (q_1 + q_3)^{n_1} q_2^{n_2} q_4^{n_3} (q_5 + q_6)^{n_4} \big].
\end{aligned} \tag{6}$$


TABLE I

Probability of a Simple Majority Winner under IC
for n Voters and m Alternatives

                                   m
    n        3         4        5^a       6^a        7         8

    3     0.94444   0.88889   0.84000   0.79778   0.76120   0.72925
    5     0.93056   0.86111   0.80048   0.74865   0.70424   0.66588
    7     0.92498   0.84997   0.78467   0.72908   0.68168   0.64090
    9     0.92202   0.84405   0.77628   0.71873
   11     0.92019   0.84037   0.77108   0.71231
   13     0.91893   0.83786   0.76753   0.70794
   15     0.91802   0.83604   0.76496   0.70476
   17     0.91733   0.83466   0.76300   0.70235
   19     0.91678   0.83357   0.76146   0.70046
   21     0.91635   0.83269   0.76023   0.69895
   23     0.91599   0.83197   0.75921   0.69769
   25     0.91568   0.83137   0.75835   0.69664
   27     0.91543   0.83085   0.75763   0.69575
   29     0.91521   0.83041   0.75700   0.69498
   31     0.91501   0.83003   0.75646   0.69431
   33     0.91484   0.82969   0.75598   0.69373
   35     0.91470   0.82939   0.75556   0.69321
   37     0.91456   0.82913   0.75519   0.69275
   39     0.91444   0.82889   0.75485   0.69234
   41     0.91434   0.82867   0.75455   0.69196
   43     0.91424   0.82848   0.75427   0.69162
   45     0.91415   0.82830   0.75402   0.69132
   47     0.91407   0.82814   0.75379   0.69104
   49     0.91399   0.82799   0.75358   0.69078
  Limit   0.91226   0.82452   0.74869   0.68476   0.63082   0.58491

  ^a Values for n ≥ 23 are computed from the approximation (17).








Since $q_i = 1/6$ for $i = 1, \ldots, 6$ under IC, (6) yields

$$P_3(n) = 3^{-n+1} \sum\nolimits^{1}
\frac{n!}{n_1!\, n_2!\, n_3!\, n_4!}\; 2^{-(n_2 + n_3)}. \tag{7}$$

Equation (7) has been used to compute $P_3(n)$ for each
$n \in \{3, 5, 7, \ldots, 49\}$. These values, along with the limiting value
for $n \to \infty$ from Ruben [10], are shown in the $m = 3$ column of
Table I. The other limiting values in the table are from [10] also.
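
The triple summation of (3) and (4), specialized to IC, can be checked numerically. The following is a minimal Python sketch, not the authors' program: the function names `p_winner_x1` and `p3_ic` are illustrative assumptions, and exact rational arithmetic is used so that the results can be compared with the fractions quoted from Sevcik [11] in Section 4.

```python
from fractions import Fraction
from math import factorial


def p_winner_x1(n, p):
    """Equation (4): probability that x1 beats x2 and x3 by simple majority.

    p = (p1, p2, p3, p4) are the probabilities of the sets S1..S4 of (1).
    """
    p1, p2, p3, p4 = p
    half = (n - 1) // 2                      # (n - 1)/2, n is odd
    total = Fraction(0)
    for n1 in range(half + 1):               # limits taken from (3)
        for n2 in range(half - n1 + 1):
            for n3 in range(half - n1 + 1):
                n4 = n - n1 - n2 - n3
                coef = factorial(n) // (
                    factorial(n1) * factorial(n2) * factorial(n3) * factorial(n4)
                )
                total += coef * p1**n1 * p2**n2 * p3**n3 * p4**n4
    return total


def p3_ic(n):
    """P_3(n) under IC: by (5) with q_i = 1/6, p = (1/3, 1/6, 1/6, 1/3)."""
    ic = (Fraction(1, 3), Fraction(1, 6), Fraction(1, 6), Fraction(1, 3))
    return 3 * p_winner_x1(n, ic)            # symmetry of IC


for n in (3, 5, 7):
    print(n, p3_ic(n), round(float(p3_ic(n)), 5))
```

For $n = 3$ and $n = 5$ the sketch returns 17/18 and 1809/1944, which round to the Table I entries 0.94444 and 0.93056.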


3. Four Alternatives 


To determine a simple relation which gives the probability of a simple
majority winner on four alternatives, the following preference relations
must be considered for $x_4 \in X - \{x_1, x_2, x_3\}$:

$$\begin{aligned}
S_5 &= \{> \in M : x_4 > x_1\} \\
S_6 &= \{> \in M : x_1 > x_4\}.
\end{aligned} \tag{8}$$

Let $n_{ij}$ be the number of voters with orders in $S_i \cap S_j$, and let
$p_{ij}$ be the probability that a voter’s linear preference order is in
$S_i \cap S_j$. The inequalities of (3) require that $x_1$ beats $x_2$ and
$x_3$. By adding an additional set of inequalities it is possible to bound
the feasible configurations of $n_{ij}$ such that $x_1$ beats $x_4$ also.
This additional set of inequalities is given by

$$\begin{aligned}
0 &\le n_{15} \le n_1 \\
0 &\le n_{25} \le n_2 \\
0 &\le n_{35} \le \mathrm{Min}\{n_3;\ (n - 1)/2 - n_{15} - n_{25}\} \\
0 &\le n_{45} \le \mathrm{Min}\{n_4;\ (n - 1)/2 - n_{15} - n_{25} - n_{35}\}
\end{aligned} \tag{9}$$

where $\mathrm{Min}\{a; b\}$ is the minimum of $a$ and $b$. Let
$n_{i6} = n_i - n_{i5}$ for $i = 1, 2, 3, 4$, and let $r$ be the probability
distribution on $M$ from which the $p_{ij}$ are computed. Then the
probability $P_4(x_1, n; r)$ that $x_1$ is the simple majority winner of the
four alternatives is given by

$$P_4(x_1, n; r) = \sum\nolimits^{1} \sum\nolimits^{2}\; n!
\prod_{\substack{i = 1, 2, 3, 4 \\ j = 5, 6}}
\frac{p_{ij}^{\,n_{ij}}}{n_{ij}!} \tag{10}$$

where $\sum^{2}$ is a quadruple summation whose limits are given by (9).





If $\#X = 4$ then there are 24 linear orders in $M$. These are listed as
follows along with their $r_i$ values:

$$\begin{array}{lll}
r_1: x_1 > x_2 > x_3 > x_4 & r_9: x_2 > x_3 > x_1 > x_4 & r_{17}: x_3 > x_4 > x_1 > x_2 \\
r_2: x_1 > x_2 > x_4 > x_3 & r_{10}: x_2 > x_3 > x_4 > x_1 & r_{18}: x_3 > x_4 > x_2 > x_1 \\
r_3: x_1 > x_3 > x_2 > x_4 & r_{11}: x_2 > x_4 > x_1 > x_3 & r_{19}: x_4 > x_1 > x_2 > x_3 \\
r_4: x_1 > x_3 > x_4 > x_2 & r_{12}: x_2 > x_4 > x_3 > x_1 & r_{20}: x_4 > x_1 > x_3 > x_2 \\
r_5: x_1 > x_4 > x_2 > x_3 & r_{13}: x_3 > x_1 > x_2 > x_4 & r_{21}: x_4 > x_2 > x_1 > x_3 \\
r_6: x_1 > x_4 > x_3 > x_2 & r_{14}: x_3 > x_1 > x_4 > x_2 & r_{22}: x_4 > x_2 > x_3 > x_1 \\
r_7: x_2 > x_1 > x_3 > x_4 & r_{15}: x_3 > x_2 > x_1 > x_4 & r_{23}: x_4 > x_3 > x_1 > x_2 \\
r_8: x_2 > x_1 > x_4 > x_3 & r_{16}: x_3 > x_2 > x_4 > x_1 & r_{24}: x_4 > x_3 > x_2 > x_1
\end{array}$$

It then follows from (1) and (8) that

$$\begin{aligned}
p_{15} &= r_{10} + r_{12} + r_{16} + r_{18} + r_{22} + r_{24} \\
p_{16} &= r_{15} + r_9 & p_{25} &= r_{21} + r_{11} \\
p_{26} &= r_7 + r_8 & p_{35} &= r_{23} + r_{17} \\
p_{36} &= r_{13} + r_{14} & p_{45} &= r_{19} + r_{20} \\
p_{46} &= r_1 + r_2 + r_3 + r_4 + r_5 + r_6.
\end{aligned} \tag{11}$$

Then $P_4(x_1, n; r)$ is obtained by substituting (11) into (10), and
$P_4(x_2, n; r)$, $P_4(x_3, n; r)$ and $P_4(x_4, n; r)$ can be obtained in a
similar fashion. The expression for the probability that there is a simple
majority winner on four alternatives is quite long and will be omitted since
it is obtainable in a straightforward fashion. It should be emphasized that,
although these relations are quite long, they are easily handled by a
computer. In addition, these results are much simpler than results
presented in previous studies. The degree to which these relations are
simpler will be pointed out in the next section.

A simplification of (10) that results from IC is not of particular interest
since $P_4(n)$ is expressible in terms of $P_3(n)$. May’s theorem [2, 7] says
that

$$P_4(n) = 2P_3(n) - 1. \tag{12}$$

Values of $P_4(n)$ for each $n \in \{3, 5, 7, \ldots, 49\}$ are included in
Table I. These were obtained from (12).
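
Although Table I takes $P_4(n)$ from (12), the summation (10) can be evaluated directly under IC and compared with May's theorem. The sketch below is an illustration under stated assumptions, not the authors' routine: the helper names `cell_probs_ic`, `term`, and `p4_x1` are hypothetical, and the $p_{ij}$ are obtained by classifying all 24 orders by the sets of (1) and (8) rather than by typing in (11).

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def cell_probs_ic():
    """p_ij under IC, found by classifying all 24 orders on alternatives 1..4."""
    probs = {}
    for order in permutations((1, 2, 3, 4)):
        rank = {alt: pos for pos, alt in enumerate(order)}  # lower = preferred
        above2 = rank[2] < rank[1]                          # x2 preferred to x1
        above3 = rank[3] < rank[1]                          # x3 preferred to x1
        i = 1 if above2 and above3 else 2 if above2 else 3 if above3 else 4
        j = 5 if rank[4] < rank[1] else 6                   # S_5: x4 above x1
        probs[i, j] = probs.get((i, j), Fraction(0)) + Fraction(1, 24)
    return probs


def term(n, counts, p):
    """n! * prod_ij p_ij^n_ij / n_ij! for one configuration {(i, j): n_ij}."""
    value = Fraction(factorial(n))
    for cell, k in counts.items():
        value *= p[cell] ** k / factorial(k)
    return value


def p4_x1(n, p):
    """Equation (10): outer limits from (3), inner limits from (9)."""
    half = (n - 1) // 2
    total = Fraction(0)
    for n1 in range(half + 1):
        for n2 in range(half - n1 + 1):
            for n3 in range(half - n1 + 1):
                n4 = n - n1 - n2 - n3
                for a in range(n1 + 1):                                # n_15
                    for b in range(n2 + 1):                            # n_25
                        for c in range(min(n3, half - a - b) + 1):     # n_35
                            for d in range(min(n4, half - a - b - c) + 1):  # n_45
                                counts = {(1, 5): a, (1, 6): n1 - a,
                                          (2, 5): b, (2, 6): n2 - b,
                                          (3, 5): c, (3, 6): n3 - c,
                                          (4, 5): d, (4, 6): n4 - d}
                                total += term(n, counts, p)
    return total


p = cell_probs_ic()
print(4 * p4_x1(3, p), 4 * p4_x1(5, p))   # P_4(n) = 4 * P_4(x_1, n) under IC
```

With $P_3(3) = 17/18$ and $P_3(5) = 1809/1944$, relation (12) predicts $P_4(3) = 8/9$ and $P_4(5) = 1674/1944$, and the direct summation can be compared against those values.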


4. Five Alternatives 

When $\#X = 5$, $M$ contains 120 linear orders. Therefore, attention will
be restricted to condition IC in examining the paradox probability for
five alternatives. The derivation of a closed-form equation for $P_5(n)$
would be fruitless if a recursion relation, similar to May’s theorem, existed
for five alternatives. The following theorem proves that such a relation
does not exist.

Theorem 1. There do not exist numbers $\alpha_i^5$ such that

$$P_5(n) = \alpha_0^5 + \alpha_3^5 P_3(n) + \alpha_4^5 P_4(n) \quad \text{for all odd } n.$$

Proof. Assume to the contrary that there do exist $\alpha_i^5$ such that

$$P_5(n) = \alpha_0^5 + \alpha_3^5 P_3(n) + \alpha_4^5 P_4(n) \quad \text{for all odd } n.$$

By (12) there must exist numbers $\beta_0^5$ and $\beta_3^5$ such that

$$P_5(n) = \beta_0^5 + \beta_3^5 P_3(n) \quad \text{for all odd } n.$$

Using the results of Sevcik [11] for $P_5(n)$ and $P_3(n)$ with $n$ equal to
three and then five,

$$\begin{aligned}
21/25 &= \beta_0^5 + \beta_3^5 \cdot 17/18 \\
32019/40000 &= \beta_0^5 + \beta_3^5 \cdot 1809/1944.
\end{aligned}$$

Solving simultaneously,

$$\beta_0^5 = -18477/10000 \quad \text{and} \quad \beta_3^5 = 14229/5000.$$

If the recursion relation holds for all odd $n$ it must hold for $n$ equal
to 7. Again using the results of Sevcik [11], this would require that

$$15253909/19440000 = -18477/10000 + (14229/5000)(258936/279936),$$

which is false.
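
The arithmetic in this proof is easy to reproduce with exact fractions. The short Python check below uses only the values quoted above from Sevcik [11]; the variable names are illustrative.

```python
from fractions import Fraction

# Values quoted in the proof of Theorem 1 (from Sevcik [11]).
P3 = {3: Fraction(17, 18), 5: Fraction(1809, 1944), 7: Fraction(258936, 279936)}
P5 = {3: Fraction(21, 25), 5: Fraction(32019, 40000),
      7: Fraction(15253909, 19440000)}

# Solve P5(n) = beta0 + beta3 * P3(n) from the two equations for n = 3 and 5.
beta3 = (P5[3] - P5[5]) / (P3[3] - P3[5])
beta0 = P5[3] - beta3 * P3[3]

# If the relation held for all odd n, this gap would be exactly zero.
predicted_p5_7 = beta0 + beta3 * P3[7]
gap = P5[7] - predicted_p5_7
print(beta0, beta3, float(gap))
```

The two solved coefficients agree with the values stated in the proof, and the gap at $n = 7$ is small but nonzero, which is the contradiction the proof relies on.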

To develop a closed-form solution for $P_5(n)$, the preference relations for
$x_1$, $x_4$ and $x_5$ must be considered for
$x_4, x_5 \in X - \{x_1, x_2, x_3\}$. Partition $M$ into four subsets as
follows:

$$\begin{aligned}
S_1' &= \{> \in M : x_4 > x_1 \text{ and } x_5 > x_1\} \\
S_2' &= \{> \in M : x_4 > x_1 > x_5\} \\
S_3' &= \{> \in M : x_5 > x_1 > x_4\} \\
S_4' &= \{> \in M : x_1 > x_4 \text{ and } x_1 > x_5\}.
\end{aligned} \tag{13}$$

Let $n_{ij}$ be the number of voters with orders in $S_i \cap S_j'$ and let
$p_{ij}$ be the probability that a linear order is in $S_i \cap S_j'$ when
$s$ is the applicable probability distribution on $M$.

The inequalities of (3) require that $x_1$ beats $x_2$ and $x_3$. By adding
additional inequalities, $x_1$ can be restricted to beat $x_4$ and $x_5$
also. The additional inequalities that are required are

$$\begin{aligned}
0 &\le n_{11} \le n_1 \\
0 &\le n_{21} \le n_2 \\
0 &\le n_{31} \le \mathrm{Min}\{n_3;\ (n-1)/2 - n_{11} - n_{21}\} \\
0 &\le n_{41} \le \mathrm{Min}\{n_4;\ (n-1)/2 - n_{11} - n_{21} - n_{31}\} \\
0 &\le n_{12} \le \mathrm{Min}\{n_1 - n_{11};\ (n-1)/2 - n_1'\} \\
0 &\le n_{22} \le \mathrm{Min}\{n_2 - n_{21};\ (n-1)/2 - n_1' - n_{12}\} \\
0 &\le n_{32} \le \mathrm{Min}\{n_3 - n_{31};\ (n-1)/2 - n_1' - n_{12} - n_{22}\} \\
0 &\le n_{42} \le \mathrm{Min}\{n_4 - n_{41};\ (n-1)/2 - n_1' - n_{12} - n_{22} - n_{32}\} \\
0 &\le n_{13} \le \mathrm{Min}\{n_1 - n_{11} - n_{12};\ (n-1)/2 - n_1'\} \\
0 &\le n_{23} \le \mathrm{Min}\{n_2 - n_{21} - n_{22};\ (n-1)/2 - n_1' - n_{13}\} \\
0 &\le n_{33} \le \mathrm{Min}\{n_3 - n_{31} - n_{32};\ (n-1)/2 - n_1' - n_{13} - n_{23}\} \\
0 &\le n_{43} \le \mathrm{Min}\{n_4 - n_{41} - n_{42};\ (n-1)/2 - n_1' - n_{13} - n_{23} - n_{33}\}
\end{aligned} \tag{14}$$

with $n_1' = \sum_{i=1}^{4} n_{i1}$.

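Then the probability that $x_1$ is the simple majority winner on the five
alternatives is given by

$$P_5(x_1, n; s) = \sum\nolimits^{1} \sum\nolimits^{3}\; n!
\prod_{i=1}^{4} \prod_{j=1}^{4} \frac{p_{ij}^{\,n_{ij}}}{n_{ij}!} \tag{15}$$

where $\sum^{3}$ is composed of 12 summations whose limits are given by (14).
With the restriction of IC it follows from (1) and (13) that

$$\begin{aligned}
& p_{11} = p_{44} = 1/5 \\
& p_{21} = p_{31} = p_{12} = p_{13} = p_{24} = p_{34} = p_{42} = p_{43} = 1/20 \\
& p_{41} = p_{14} = p_{22} = p_{23} = p_{32} = p_{33} = 1/30.
\end{aligned} \tag{16}$$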

Then $P_5(x_1, n)$ is obtained by substituting (16) into (15) and $P_5(n)$ is
obtained from $P_5(x_1, n)$ in the obvious way. The evaluation of $P_5(n)$
in the indicated fashion is fairly complex but is easily computable for
values of $n$ that are quite large by previous standards. For example,
$P_5(5)$ was calculated with (15) and (16) in 0.33 seconds (IBM 370/168)
whereas DeMeyer and Plott [1] estimated that their solution, which would
require the evaluation of functions containing 119 summations, would
require 300 hours (IBM 7094) to obtain the same value. As a result, we
see that while the solutions presented in this paper are complex, they are
much simpler than solutions of previous studies. Table I lists $P_5(n)$ for
each $n \in \{3, 5, 7, \ldots, 21\}$ using (15) and (16).
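
For very small $n$ the value of $P_5(n)$ under IC can be cross-checked by brute force. The sketch below is not the nested summation of (14) and (15): as an assumed, equivalent simplification it records for each voter only the set of alternatives ranked above $x_1$ (each set of size $k$ has probability $1/(5\binom{4}{k})$ under IC, which reproduces the cell probabilities 1/5, 1/20 and 1/30) and enumerates multisets of such sets. The function name `p5_ic` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, combinations_with_replacement
from math import comb, factorial


def p5_ic(n):
    """P_5(n) under IC by enumerating voter profiles (practical for small n)."""
    others = (2, 3, 4, 5)
    # All 16 possible sets of alternatives that a voter ranks above x1.
    subsets = [frozenset(c) for k in range(5) for c in combinations(others, k)]
    prob = {s: Fraction(1, 5 * comb(4, len(s))) for s in subsets}
    half = (n - 1) // 2
    total = Fraction(0)
    for profile in combinations_with_replacement(range(len(subsets)), n):
        chosen = [subsets[k] for k in profile]
        # x1 must not lose to any alternative: at most (n-1)/2 voters above.
        if any(sum(a in s for s in chosen) > half for a in others):
            continue
        weight = Fraction(factorial(n))       # multinomial weight of profile
        for k, mult in Counter(profile).items():
            weight *= prob[subsets[k]] ** mult / factorial(mult)
        total += weight
    return 5 * total                          # symmetry of IC


print(p5_ic(3), p5_ic(5))
```

The results can be compared with the fractions 21/25 and 32019/40000 used in the proof of Theorem 1. The enumeration grows quickly with $n$, which is why the structured limits of (14) matter for larger electorates.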





To obtain values of $P_5(n)$ for $n$ greater than 21, an approximation
was developed for the probability $1 - P_5(n)$ that the voting paradox
does occur. The approximation is given by

$$Q_5(n) = \left[ 2(6/5)^2 - (\pi/40)\,(n - 3)/(5n - 2) \right]
\left[ 1 - P_3(n) \right]. \tag{17}$$

$Q_5(n)$ was designed to be exact for $n$ equal to three and to be correct,
within the limits of table accuracy, for the limiting value in $n$. Table II
compares the results of $Q_5(n)$ and $1 - P_5(n)$ for each
$n \in \{3, 5, \ldots, 21\}$.


TABLE II

Exact Values of the Probability of the Voting Paradox
on Five Alternatives under IC, and the
Approximation Q_5(n)

    n      Exact value     Q_5(n)

    3       0.16000       0.16000
    5       0.19952       0.19951
    7       0.21533       0.21534
    9       0.22372       0.22373
   11       0.22892       0.22891
   13       0.23247       0.23247
   15       0.23504       0.23504
   17       0.23700       0.23700
   19       0.23854       0.23854
   21       0.23977       0.23977
  Limit     0.25131       0.25131


The results of Table II indicate that $Q_5(n)$ is accurate within the range
of table data for $n$ greater than eleven. Therefore, (17) was used to obtain
$P_5(n)$ for each $n \in \{23, 25, \ldots, 49\}$ in Table I.
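
The approximation (17) is simple enough to evaluate in a few lines. The Python sketch below is illustrative only (the names `q5` and `q5_limit` are assumptions); it takes $P_3(n)$ as an input, for example from Table I, and treats the limiting case by letting $(n - 3)/(5n - 2)$ tend to $1/5$.

```python
from math import pi


def q5(n, p3):
    """Approximation (17) to the paradox probability 1 - P_5(n)."""
    return (2 * (6 / 5) ** 2 - (pi / 40) * (n - 3) / (5 * n - 2)) * (1 - p3)


def q5_limit(p3_limit):
    """Limit of (17) as n grows, using (n - 3)/(5n - 2) -> 1/5."""
    return (2 * (6 / 5) ** 2 - pi / 200) * (1 - p3_limit)


# P_3 values: 17/18 is exact for n = 3; the others are Table I entries.
print(round(q5(3, 17 / 18), 5))
print(round(q5(11, 0.92019), 5))
print(round(q5_limit(0.91226), 5))
```

At $n = 3$ the correction term vanishes, so $Q_5(3) = 2(6/5)^2/18 = 0.16$, which is the exactness property mentioned in the text.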


5. More Than Five Alternatives 

An exact closed-form solution for $P_6(n)$ will not be considered since
it will be shown that

$$P_6(n) = 3P_5(n) - 5P_3(n) + 3, \quad \text{for all odd } n. \tag{18}$$

In fact, the following theorem shows that similar relations exist for all
even numbers of alternatives.






Theorem 2. For all even $m \ge 4$, there exist numbers $\alpha_i^m$ such that

$$P_m(n) = \alpha_0^m + \sum_{i \in F} \alpha_i^m P_i(n), \quad \text{for all odd } n,$$

where $F = \{i : 3 \le i < m \text{ and } i \text{ is odd}\}$.

Proof. We begin with the proof for $m = 4$. Define $P$ as the probability
that $x_1$ is not the simple majority winner. $P$ is obtained in two ways as
follows:

$$P = 1 - P_4(x_1, n) \tag{19}$$

$$P = P(E_2 \cup E_3 \cup E_4) \tag{20}$$

where $P(E_2 \cup E_3 \cup E_4)$ is the probability of the union of events
$E_2$, $E_3$ and $E_4$ with $E_i$ defined as the occurrence of $x_i$ beating
$x_1$. As described by Hogg and Craig [4]

$$\begin{aligned}
P(E_2 \cup E_3 \cup E_4) = {} & P(E_2) + P(E_3) + P(E_4) - P(E_2 \cap E_3) \\
& - P(E_2 \cap E_4) - P(E_3 \cap E_4) + P(E_2 \cap E_3 \cap E_4).
\end{aligned} \tag{21}$$

Under condition IC, $P(\bigcap_{i \in S} E_i) = P(\bigcap_{i \in S} E_i^c)$
for any set $S$ with $E_i^c$ defined as the complement of $E_i$. Therefore,
(21) reduces to

$$P(E_2 \cup E_3 \cup E_4) = \sum_{i=1}^{3} (-1)^{i+1} C_i^3 P_{i+1}(x_1, n) \tag{22}$$

with $P_2(x_1, n) = 1/2$ and $C_i^n = n!/[i!(n - i)!]$. By equating (19) and
(22) and reducing, we obtain

$$P_4(n) = 2P_3(n) - 1,$$

which is May’s theorem.

For even $m \ge 6$, (19) and (20) generalize to

$$P = 1 - P_m(x_1, n) \tag{23}$$

$$P = P\Big( \bigcup_{i=2}^{m} E_i \Big)
= \sum_{i=1}^{m-1} C_i^{m-1} (-1)^{i+1} P_{i+1}(x_1, n). \tag{24}$$

By equating (23) and (24) and reducing, we obtain

$$P_m(n) = m/2 + (m/2) \sum_{i=1}^{m-2} (-1)^i C_i^{m-1} P_{i+1}(x_1, n). \tag{25}$$
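
As a worked instance of (25), take $m = 6$ and use $P_k(x_1, n) = P_k(n)/k$ (the symmetry of IC) together with $P_2(n) = 1$:

$$\begin{aligned}
P_6(n) &= 3 + 3\left[ -\tfrac{5}{2} + \tfrac{10}{3} P_3(n) - \tfrac{10}{4} P_4(n) + \tfrac{5}{5} P_5(n) \right] \\
&= -\tfrac{9}{2} + 10 P_3(n) - \tfrac{15}{2} P_4(n) + 3 P_5(n).
\end{aligned}$$

Substituting May's theorem $P_4(n) = 2P_3(n) - 1$ gives

$$P_6(n) = -\tfrac{9}{2} + 10 P_3(n) - 15 P_3(n) + \tfrac{15}{2} + 3 P_5(n) = 3 P_5(n) - 5 P_3(n) + 3,$$

which is relation (18).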





[TABLE III: values of the $\alpha_i^m$ numbers for each $m \in \{4, 6, \ldots, 24\}$. The table body is not recoverable from the source.]






By starting with $P_m(n)$ for the desired $m$, as in (25), and sequentially
replacing the $P_{i+1}$ terms for which $i + 1$ is even and less than $m$,
from largest to smallest, an expression for $P_m(n)$ as a linear combination
of $P_j(n)$ for odd $j < m$ is obtained for all odd $n$.

It appears that the $\alpha_i^m$ numbers are always integer valued, but this
has not been proved. A computer was programmed to calculate the $\alpha_i^m$
numbers for each $m \in \{4, 6, \ldots, 24\}$. The results are shown in
Table III.
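
The substitution procedure just described can be sketched in Python. This is an assumed implementation, not the authors' program: the function name `alpha` is hypothetical, it relies on $P_k(x_1, n) = P_k(n)/k$ under IC and $P_2(n) = 1$, and it substitutes the fully expanded form of each even-indexed term recursively, which gives the same result as replacing terms from largest to smallest.

```python
from fractions import Fraction
from math import comb


def alpha(m):
    """Return {0: constant, j: coefficient of P_j(n)} for even m >= 4."""
    coeffs = {0: Fraction(m, 2)}                  # leading m/2 in (25)
    for i in range(1, m - 1):                     # i = 1, ..., m - 2
        k = i + 1                                 # term involves P_k(x_1, n)
        # Under IC, P_k(x_1, n) = P_k(n) / k.
        weight = Fraction(m, 2) * (-1) ** i * comb(m - 1, i) / k
        if k == 2:
            expansion = {0: Fraction(1)}          # P_2(n) = 1 for odd n
        elif k % 2 == 1:
            expansion = {k: Fraction(1)}          # odd k is kept as is
        else:
            expansion = alpha(k)                  # even k < m: substitute
        for j, c in expansion.items():
            coeffs[j] = coeffs.get(j, Fraction(0)) + weight * c
    return coeffs


for m in (4, 6, 8):
    print(m, {j: str(c) for j, c in sorted(alpha(m).items())})
```

For $m = 4$ and $m = 6$ the output reproduces (12) and (18). For $m = 8$ it gives $P_8(n) = 4P_7(n) - 14P_5(n) + 28P_3(n) - 17$, which is consistent with the $n = 3$ row of Table I to within rounding.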

$P_6(n)$ values were calculated from (18) for each
$n \in \{3, 5, \ldots, 49\}$ by using the corresponding values of $P_3(n)$
and $P_5(n)$ in Table I. In addition, Table I contains $P_7(n)$ and $P_8(n)$
values for each $n \in \{3, 5, 7\}$. The $P_7(n)$ values are from Sevcik [11]
and the $P_8(n)$ values are from the recursion relation for $P_8(n)$ in
Table III and the corresponding $P_3(n)$, $P_5(n)$ and $P_7(n)$ values.

The procedures that were used to go from an exact closed-form solution
for $P_3(n)$ to $P_5(n)$ can be used to find exact closed-form solutions for
$P_7(n), P_9(n), \ldots$. However, no additional relations will be developed
in this paper.


References 

1. F. DeMeyer and C. R. Plott, The probability of a cyclical majority, Econometrica 
38 (1970), 345-354. 

2. P. C. Fishburn, A proof of May’s theorem P(m, 4) = 2P(m, 3), Behav. Sci.
18 (1973), 212. 

3. M. Garman and M. Kamien, The paradox of voting: probability calculations, 
Behav. Sci. 13 (1968), 306-316. 

4. R. V. Hogg and A. T. Craig, “Introduction to Mathematical Statistics,” 2nd ed., 
Macmillan, New York, 1965. 

5. J. S. Kelly, Voting anomalies, the number of voters, and the number of alternatives, 
Econometrica 42 (1974), 239-251.

6. D. Klahr, A computer simulation of the paradox of voting, Amer. Pol. Sci. Rev.
60 (1966), 384-390. 

7. R. M. May, Some mathematical remarks on the paradox of voting, Behav. Sci. 
16 (1971), 143-151. 

8. R. G. Niemi and H. F. Weisberg, A mathematical solution to the probability 
of the paradox of voting, Behav. Sci. 13 (1968), 317-323. 

9. J. E. Pomeranz and R. L. Weil, The cyclical majority problem, Comm. ACM
13 (1970), 251-254. 

10. H. Ruben, On the moments of order statistics in samples from normal populations, 
Biometrika 41 (1954), 200-227. 

11. K. E. Sevcik, Exact probabilities of a voter’s paradox through seven issues and 
seven judges, U. of Chicago, Institute for Computer Research Quarterly Report 
22 (1969), Section III-B. 



JOURNAL OF ECONOMIC THEORY 13, 26-46 (1976)


On Efficiency and Pareto Optimality of Competitive Programs 
in Closed Multisector Models* 

Mukul Majumdar 

Department of Economics, Cornell University, Ithaca, New York 14853 

Tapan Mitra

Department of Economics, University of Rochester, Rochester, New York 14627 

AND 

Daniel McFadden 

Department of Economics, University of California, Berkeley, California 94720 
Received August 8, 1975 


I. Introduction 

One of the well-known paradoxes of infinity is the possibility that a 
competitive program is inefficient, such inefficiency being linked to over¬ 
accumulation of capital. Recognition of the serious implications of this 
fact has led to attempts to derive conditions that can isolate completely 
the set of efficient competitive programs. However, these conditions seem 
to depend on rather specific properties of the technology, and even among 
the simpler economic models there are basic qualitative differences in 
the criteria for characterizing completely the set of efficient competitive 
programs. Nevertheless, in order to gain a proper understanding of the 

* An earlier version of the paper was presented at the Mathematical Social Science 
Board Colloquium on Mathematical Economics that was held in Berkeley during 
August 1974. Thanks are due to the participants in the session and to A. Bose, R. 
Radner, L. W. McKenzie, D. Cass, and K. Shell for helpful conversations. Detailed
comments of Professor Cass on earlier versions have resulted in sharper results and 
fewer misleading and erroneous statements. The Group for the Applications of Mathe¬ 
matics and Statistics to Economics, University of California, Berkeley, generously 
provided Mukul Majumdar with facilities that helped collaboration. Research was 
supported in part by National Science Foundation Grant No. SOC72-05551A02 to 
the University of California, Berkeley, and by Grant No. GS44279 to Cornell 
University. 




CLOSED MULTISECTOR MODELS 




role of prices in guiding resource allocation over time in a decentralized 
or a centrally planned economy, it is essential that we have easily applicable 
criteria identifying the efficient competitive programs, at least for the 
more important models of intertemporal resource allocation. For the 
neoclassical model a fundamental result has recently been established 
by Cass [2], and his characterization has been extended to some multisector
open models [3] (see also [18]). No parallel study has yet been undertaken 
for closed multisector models, i.e., models in which all commodities are 
producible. The first purpose of this paper is to point out that for a large 
class of closed multisector models, a program is efficient if and only if 
it satisfies the intertemporal profit maximization condition relative to 
a nonnull sequence of nonnegative$^1$ price vectors and the transversality
condition that the value of inputs at these competitive prices goes to zero.
It is immediately seen (in Sect. IV) that this characterization is quite 
different from the one obtained by Cass. The Cass criterion does not 
apply to our framework, just as our condition need not necessarily be 
satisfied by an efficient program in an open model. The class of models 
considered in this paper includes, in particular, those of Dorfman, 
Samuelson, and Solow [5] and McKenzie [15], in which the technology 
permits output substitution, as well as those of Samuelson and Solow [24], 
Morishima [19], and Nikaido [20], in which there is the possibility of 
input substitution. In view of the extensive use of such closed models in 
theoretical and empirical literature on growth and planning (see, for 
example, [4, 22, 26]), and the fact that our substitution assumptions are 
perhaps the most commonplace ones in economic theory, as reflected 
in the “usual” shapes of isoquants and production possibility curves, 
a unified and systematic presentation providing a complete characteri¬ 
zation of efficient programs in these models will hopefully be of some 
interest. 

It is known from [17] that a sufficient condition for efficiency is the 
existence of a sequence of strictly positive competitive prices relative to 
which the transversality condition holds. However, even in finite 
dimensions and with output substitution in the technology an efficient 
program need not have strictly positive competitive prices, and for infinite 
programs even when the prices happen to be strictly positive, the trans¬ 
versality condition does not necessarily hold in open models (recall the 
“golden rule” examples!). The important fact that with substitution 
possibilities in closed models, in which there is a strictly positive vector of 
von Neumann stocks, nonnegative competitive prices together with the 

$^1$ An $m$-vector $x = (x^i)$ is nonnegative (written $x \ge 0$) if
$x^i \ge 0$ for all $i$. It is semipositive (written $x > 0$) if $x \ge 0$
and $x \ne 0$. It is strictly positive (written $x \gg 0$) if $x^i > 0$ for
all $i$. A sequence $p = (p_t)$ of $m$-vectors is nonnull if $p_t \ne 0$ for
at least one $t$.



MAJUMDAR, MITRA AND MCFADDEN


transversality condition are equivalent to efficiency, has not appeared 
in the literature, although the adequacy of the transversality condition in 
signalling capital overaccumulation has often been discussed (see [25,
p. 273]).

Assuming the strict positivity of competitive prices, Kurz [12, p. 281] 
and Kurz and Starrett [13, pp. 575, 576] were able to show that the 
Malinvaud prices associated with an efficient program must necessarily 
satisfy the transversality condition when (a) the efficient program is 
“locally contractable” or (b) “productive.” However, one can show that 
an efficient program in a closed model can never satisfy the local
contractability assumption and, as Kurz himself recognized, the condition of
productivity is too strong and is not implied by the substitution conditions 
that we shall consider. In our proof of the necessity of transversality in 
the closed model, we show the existence of the system of competitive 
prices supporting the efficient program such that the present value of 
any feasible program is finite and is maximized at the given efficient 
program. Note that present value maximization is stronger than the 
properties usually obtained in the more general framework of 
Malinvaud [17]. 

A second purpose of the paper is to apply our result to characterize 
Pareto optimal programs in a model with overlapping generations and to 
relate the problem of Pareto optimal distribution over time to the problem 
of efficient allocation of resources in this framework. Actually, following 
Samuelson [23], it is conventional to examine the distribution question in 
a “productionless” economy—where the agents have given endowments 
for exchange. We introduce production in a simple way, and in the 
extended model, our main result (under differentiability assumptions) is 
that a program is long-run Pareto optimal if and only if it is short-run 
Pareto optimal and efficient. Thus, roughly speaking, the problem of a 
“proper” distribution of goods is essentially a short-run feature and the 
only long-run problem—the only paradox of infinity—is one of inefficiency 
or capital overaccumulation. This proposition was first proved by Bose [1] 
for a neoclassical framework. Our exercise supplements his work and 
indicates that the proposition is valid for a more general class of models.


II. Efficiency in Technologies with Output Substitution 
IIa. The Model

The framework chosen here is the familiar closed model of production 
(see, for example, [20] or [19, Chap. VI], for detailed interpretation). 
Consider an economy in which there are $m$ producible goods. The
technology does not change over time, and is described by a set
$\mathcal{T}$ in the nonnegative orthant of $R^{2m}$: a pair $(x, y)$ is in
$\mathcal{T}$ if and only if it is possible to get the output vector $y$ in
period $(t + 1)$ by using the input vector $x$ in period $t$. The following
assumptions on $\mathcal{T}$ are maintained throughout this paper:

(A.1) $\mathcal{T}$ is a closed convex cone in the nonnegative orthant of
$R^{2m}$ (continuity, convexity, and constant returns to scale).

(A.2) $(0, y) \in \mathcal{T}$ implies $y = 0$ (impossibility of free
production).

(A.3) There is $(x, y) \in \mathcal{T}$ with $y \gg 0$ (producibility).

(A.4) $(x, y) \in \mathcal{T}$ and $x' \ge x$, $0 \le y' \le y$ imply
$(x', y') \in \mathcal{T}$ (free disposal).

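As usual, for any $(x, y) \in \mathcal{T}$ with $x > 0$, let
$\lambda(x, y) = \max\{\lambda : y \ge \lambda x\}$. It is known (see
[9, p. 338]) that under (A.1) through (A.4), there are
$(\hat{x}, \hat{y}) \in \mathcal{T}$, $\hat{\lambda} > 0$ ($\hat{\lambda}$
is finite), and a price vector $\hat{p} > 0$ such that

$$\begin{aligned}
& \hat{\lambda} = \lambda(\hat{x}, \hat{y}), \quad
\hat{\lambda} \ge \lambda(x, y) \text{ for all } (x, y) \in \mathcal{T}
\text{ with } x > 0, \\
& \hat{p} y \le \hat{\lambda} \hat{p} x \text{ for all } (x, y) \in \mathcal{T}.
\end{aligned} \tag{2.1}$$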

We follow the usual convention of referring to $\hat{p}$ as a von Neumann
price vector, $\hat{x}$ as a vector of von Neumann stocks, and
$\hat{\lambda}$ as the von Neumann growth factor. In what follows, we shall
assume without loss of generality that $\hat{\lambda} = 1$, in order to
simplify notation. Given any $\mathcal{T}'$ satisfying the above
assumptions, one simply takes the corresponding present value technology
$\mathcal{T} = \{(x, y) : (x, \hat{\lambda} y) \in \mathcal{T}'\}$.
$\mathcal{T}$ has the same structure as $\mathcal{T}'$, and obviously
has a maximal von Neumann growth factor equal to one which is
achievable at any vector of input proportions at which $\hat{\lambda}$ is
achievable in $\mathcal{T}'$. The interested reader is referred to the paper
by Winter [27, pp. 68-9] for details, and is invited to verify that the
assumptions made below are not in conflict with this convention. Keeping in
mind that $\hat{\lambda} = 1$, the next assumption can be stated simply as

(A.5) There is some $\hat{x} \gg 0$ such that
$(\hat{x}, \hat{x}) \in \mathcal{T}$.

In other words, we assume that there is a strictly positive vector of
von Neumann stocks. Next, we define a feasible production program from
$x$ as a sequence $(x, y) = (x_t, y_{t+1})$ such that

$$x_0 = x, \quad x_t \le y_t \text{ for all } t \ge 1, \quad
(x_t, y_{t+1}) \in \mathcal{T} \text{ for all } t \ge 0. \tag{2.2}$$





The consumption program $c = (c_t)$ generated by $(x, y)$ is defined as

$$c_t = y_t - x_t \ (\ge 0) \quad \text{for all } t \ge 1. \tag{2.3}$$

We refer to $(x, y, c)$ as a feasible program from $x$, it being understood
that $(x, y)$ is a production program and $c$ is the corresponding
consumption program. A feasible program $(x^*, y^*, c^*)$ from $x$ is
efficient if there is no other feasible program $(x, y, c)$ from $x$ such
that $c_t \ge c_t^*$ for all $t$ and $c_t > c_t^*$ for some $t$. A feasible
program $(x^*, y^*, c^*)$ from $x$ is competitive if there is a nonzero
sequence $(p_t^*)$ of nonnegative price vectors such that for all $t \ge 0$
one has

$$0 = p_{t+1}^* y_{t+1}^* - p_t^* x_t^* \ge p_{t+1}^* y - p_t^* x
\quad \text{for all } (x, y) \text{ in } \mathcal{T}. \tag{2.4}$$

In other words, the intertemporal profit maximization condition (2.4)
is satisfied for all $t$. A competitive program $(x^*, y^*, c^*)$ satisfies
the transversality condition if $p_t^* x_t^*$ goes to zero as $t$ goes to
infinity.
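
To make the definitions (2.2) through (2.4) concrete, the following Python sketch works through a toy case that is not taken from the paper: a single good ($m = 1$) with the assumed technology $\mathcal{T} = \{(x, y) : 0 \le y \le x\}$, constant prices $p_t = 1$, and the program $x_t = 2^{-t}$, $y_{t+1} = x_t$. All names are illustrative. With these prices the profit $y - x$ is never positive on $\mathcal{T}$ and is zero along the program, so (2.4) holds, and $p_t x_t = 2^{-t}$ goes to zero.

```python
from fractions import Fraction


def consumption(x, y):
    """c_t = y_t - x_t for t >= 1, equation (2.3); y[0] is unused padding."""
    return [y[t] - x[t] for t in range(1, len(x))]


def is_feasible(x, y):
    """Check (2.2) for the assumed toy technology T = {(x, y): 0 <= y <= x}."""
    in_technology = all(0 <= y[t + 1] <= x[t] for t in range(len(x) - 1))
    stocks_ok = all(x[t] <= y[t] for t in range(1, len(x)))
    return in_technology and stocks_ok


T = 10
x = [Fraction(1, 2 ** t) for t in range(T + 1)]   # x_t = 2^-t, x_0 = 1
y = [None] + x[:-1]                               # y_{t+1} = x_t
c = consumption(x, y)

# With p_t = 1 the value of consumption up to T telescopes to x_0 - x_T.
print(is_feasible(x, y), sum(c), x[0] - x[T])
```

By contrast, the stationary program $x_t = y_t = 1$ is feasible in the same toy technology but yields zero consumption in every period, and its input value $p_t x_t = 1$ does not vanish.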

IIb. Technologies with Output Substitution

We now introduce the concept of output substitution. Essentially it is
required that if it is possible to produce $y$ from $x$ with $y^i > 0$, then
for any commodity $j$ ($\ne i$) it is also possible to produce more than
$y^j$ of it from $x$ with a suitable reduction of $y^i$, keeping the outputs
of all other commodities unchanged. More formally, we have

(A.6) Suppose that $(x, y) \in \mathcal{T}$ with $y^i > 0$ for some $i$.
Given any $j \ne i$ and any $\delta_i$ satisfying $0 < \delta_i < y^i$, there
exists $\delta_j > 0$ such that $(x, y') \in \mathcal{T}$ where
$y'^i = y^i - \delta_i$, $y'^j = y^j + \delta_j$, and $y'^k = y^k$ for all
$k \ne i, j$.

Note that $\delta_j$ in general depends on $\delta_i$ as well as on the
$(x, y)$ under consideration. In Fig. 1, technologies (a) and (b) satisfy
output substitution, while (c) does not. Using convexity, one can show that
if $(x, \bar{y}) \in \mathcal{T}$, $\bar{y} \ge y \ge 0$, and
$\bar{y}^i > y^i$ for some $i$, then by (A.6) there exists $y'$ with
$(x, y') \in \mathcal{T}$ and $y' \gg y$.$^2$ Two examples of technologies
with output substitution will be given.

Example 2.1. The polyhedral $\mathcal{T}$ defined as
$\mathcal{T} = \{(x, y) : Az \le x,\ Bz \ge y,\ z \ge 0\}$ where $A$ is an
$n \times n$ strictly positive matrix and $B$ is the $n \times n$ identity
matrix, satisfies (A.6). In general, however, if $\mathcal{T}$ is a
polyhedral convex cone output substitution may not be possible.

$^2$ Since the argument is used repeatedly in our proofs, we spell it out
completely. Choose $0 < \delta_i < \bar{y}^i - y^i$. For each $j \ne i$,
there exists $\delta_j > 0$ such that
$(x, \bar{y} - \delta_i \omega_i + \delta_j \omega_j) \in \mathcal{T}$,
where $\omega_i$ is a vector with a one in the $i$th place, zeros elsewhere.
By convexity,
$(x, y') = (1/(m - 1)) \sum_{j \ne i} (x, \bar{y} - \delta_i \omega_i + \delta_j \omega_j) \in \mathcal{T}$,
and $y' \gg y$.








Example 2.2. Let $F$ be a nonnegative real-valued function on the
nonnegative orthant of $R^{2m-1}$ such that it is continuously
differentiable, concave, and homogeneous of degree one. Define

$$\mathcal{T} = \{(x, y) : 0 \le y^m \le F(x^1, \ldots, x^m, y^1, \ldots, y^{m-1})\}.$$

This is the well-known neoclassical transformation process of Dorfman,
Samuelson, and Solow [5]. $F(\cdot)$ gives the maximum value of the output
of the $m$th good given the values of its arguments. It will be assumed that
$\partial F/\partial x^i > 0$ for $i = 1, \ldots, m$ and
$\partial F/\partial y^i < 0$ for $i = 1, 2, \ldots, m - 1$, and
verification of the properties listed above is easy.

While the requirement that the technology satisfies (A.6) may be strong, it is clear that (A.6) does not guarantee that for an efficient consumption program,³ the associated prices $(p_t)$ are strictly positive. The following theorem settles the question of relating efficient to competitive programs, when the technology satisfies (A.1) through (A.6):

Theorem 2.1. Under (A.1) through (A.6) a feasible program $(x^*, y^*, c^*)$ from $x \gg 0$ is efficient if and only if there exists a nonnull sequence $(p_t^*)$ of nonnegative price vectors satisfying for all $t = 0, 1, 2, \ldots$,

$$0 = p_{t+1}^* y_{t+1}^* - p_t^* x_t^* \geq p_{t+1}^* y - p_t^* x \quad \text{for all } (x, y) \in \mathcal{T} \tag{2.5}$$

and

$$\lim_{t \to \infty} p_t^* x_t^* = 0. \tag{2.6}$$

Proof. (Sufficiency). Suppose that $(x^*, y^*, c^*)$ is a feasible program from $x$ such that there exists a nonnull sequence $(p_t^*)$ of nonnegative price vectors satisfying Eqs. (2.5) and (2.6). We have to prove that

³ Consider the example of Arrow (see [8, p. 88, Footnote 52]) where $\mathcal{T} = \{(x, y) \geq 0: (y^1)^2 + (y^2)^2 \leq [\min(x^1, x^2)]^2\}$. However the (productive) efficient point $y^* = (1, 0)$ produced from $x^* = (1, 1)$ can be supported only by the price system $p_{t+1}^1 = 1$ and $p_{t+1}^2 = 0$.




(x*, y*, c*) is efficient. To this effect we start by showing that for any 
feasible program (x, y, c) from x one has 

$$\sum_{t=1}^{\infty} p_t^* c_t \leq \sum_{t=1}^{\infty} p_t^* c_t^* = p_0^* x. \tag{2.7}$$

Feasibility of $(x, y, c)$ from $x$ and Eq. (2.5) guarantee that for any $T \geq 1$,

$$S_T = \sum_{t=1}^{T} p_t^* c_t \leq p_0^* x. \tag{2.8}$$

Nonnegativity of $p_t^*$ and $c_t$ implies that $S_T$ is a monotonically nondecreasing sequence which by (2.8) is bounded above. Hence, $\lim_{T \to \infty} S_T$ exists and clearly

$$\lim_{T \to \infty} S_T = \sum_{t=1}^{\infty} p_t^* c_t \leq p_0^* x. \tag{2.9}$$

For the particular program $(x^*, y^*, c^*)$ under consideration, one has $\sum_{t=1}^{T} p_t^* c_t^* = p_0^* x - p_T^* x_T^*$. Using Eq. (2.6) and taking limits one has Eq. (2.7).
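The telescoping identity behind (2.7) through (2.9) can be illustrated on a toy case. The sketch below assumes a hypothetical one-good storage technology $\{(x, y) \geq 0: y \leq x\}$ with constant prices $p_t = 1$, so that $p_{t+1} y_{t+1} = p_t x_t$ whenever the whole stock is carried forward; neither this technology nor the two programs appear in the paper. It compares a program whose terminal stock value vanishes with one whose terminal value stays near 1/2.

```python
from fractions import Fraction

def present_value(stocks, prices):
    """Return (sum_{t=1}^T p_t c_t, p_0 x_0 - p_T x_T, consumption list)."""
    T = len(stocks) - 1
    # Storage technology: y_t = x_{t-1}, so c_t = y_t - x_t = x_{t-1} - x_t.
    consumption = [stocks[t - 1] - stocks[t] for t in range(1, T + 1)]
    lhs = sum(prices[t] * consumption[t - 1] for t in range(1, T + 1))
    rhs = prices[0] * stocks[0] - prices[T] * stocks[T]
    return lhs, rhs, consumption

T = 20
prices = [Fraction(1)] * (T + 1)                      # hypothetical competitive prices
run_down = [Fraction(1, 2**t) for t in range(T + 1)]  # terminal value tends to 0
hoard = [Fraction(1, 2) + Fraction(1, 2**(t + 1)) for t in range(T + 1)]  # tends to 1/2

pv_a, bound_a, c_a = present_value(run_down, prices)
pv_b, bound_b, c_b = present_value(hoard, prices)
print(pv_a, pv_b, prices[T] * hoard[T])
```

Both programs start from the same stock and satisfy the competitive condition at these prices, but the second leaves half of the initial value unconsumed forever and is dominated period by period by the first, which is the role the transversality condition plays in the sufficiency argument.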

Next, an important property of the competitive prices $(p_t^*)$ satisfying Eq. (2.5) is noted:

$$p_t^* = 0 \text{ implies } p_{t+1}^* = 0 \text{ for all } t \geq 0; \text{ hence } p_0^* > 0;\ p_1^* > 0. \tag{2.10}$$

This conclusion does not depend on the transversality condition Eq. (2.6). To establish Eq. (2.10) note that if (a) $p_T^* = 0$ for some $T \geq 1$, and $p_{T+1}^* > 0$, Eq. (2.5) implies that $p_{T+1}^* y - p_T^* x \leq 0$ for all $(x, y) \in \mathcal{T}$. By (A.3), $p_{T+1}^* y > 0 = p_T^* x$, a contradiction. If (b) $p_0^* = 0$ and $p_1^* > 0$, we have again by Eq. (2.5), $p_1^* y - p_0^* x \leq 0$ for all $(x, y) \in \mathcal{T}$. By (A.3) there is some $\beta > 0$ such that $(x, \beta y) \in \mathcal{T}$. Hence, $0 < p_1^* \beta y \leq p_0^* x = 0$, a contradiction. By (a) and (b), $p_t^* = 0$ implies $p_{t+1}^* = 0$ for all $t \geq 0$. Since the sequence $(p_t^*)$ is not null, $p_0^* > 0$. Finally, since $p_0^* x > 0$, we have by Eq. (2.5), $p_1^* y_1^* = p_0^* x > 0$, which means that $p_1^* > 0$, completing the proof of Eq. (2.10).

We now come to the proof of the result. Suppose that $(x^*, y^*, c^*)$ is not efficient. This means that there is a feasible program $(\bar{x}, \bar{y}, \bar{c})$ from $x$ such that $\bar{c}_t \geq c_t^*$ for all $t \geq 1$ and $\bar{c}_t > c_t^*$ for some $t$, say $t = t' \geq 1$. Either $p_{t'}^* = 0$ (Case I) or $p_{t'}^* > 0$ (Case II). We examine each case in turn for a contradiction.

Case I. Consider the last period $\tau$, $1 \leq \tau < t'$, for which $p_\tau^* > 0$. Since $\bar{y}_{t'} \geq \bar{c}_{t'} > c_{t'}^* \geq 0$, (A.2) implies $\bar{y}_{t'-1} \geq \bar{x}_{t'-1} > 0$. Repeating this process, we finally get $\bar{x}_\tau > 0$. Construct a feasible program $(x'', y'', c'')$ from $x$ with $c''_t = \bar{c}_t$ for $t < \tau$, $c''_\tau = \bar{c}_\tau + \bar{x}_\tau$, and $c''_t = 0$ for $t > \tau$. By (A.6), one can construct a feasible program $(x', y', c')$ from $x$ such that $c'_t = c''_t$ for $t \neq \tau$ and $c'_\tau \gg c_\tau^*$. Then using (2.10), $\sum_{t=1}^{\infty} p_t^* c'_t = \sum_{t=1}^{\tau} p_t^* c'_t > \sum_{t=1}^{\tau} p_t^* c_t^* = \sum_{t=1}^{\infty} p_t^* c_t^*$, a contradiction of (2.7).

Case II. Using (A.5), one can construct a program $(x', y', c')$ from $x$ such that $c'_t = \bar{c}_t$ for $t \neq t'$ and $c'_{t'} \gg c_{t'}^*$. Since $p_{t'}^* > 0$, this implies $\sum_{t=1}^{\infty} p_t^* c'_t > \sum_{t=1}^{\infty} p_t^* c_t^*$, contradicting (2.7). Thus, the sufficiency part of the theorem is proved.

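(Necessity). An important consequence of (A.5) and (A.6) is that

$$\text{the von Neumann price vector } \hat{p} \text{ is strictly positive.} \tag{2.11}$$

Since $\hat{p}$ is semipositive, for some $j$ one has $\hat{p}^j > 0$. Suppose that for some $i$, $\hat{p}^i = 0$. By (A.5) we have $(\hat{x}, \hat{x}) \in \mathcal{T}$ where $\hat{x} \gg 0$. By applying (A.6) we have $(\hat{x}, y) \in \mathcal{T}$ where $y^j > \hat{x}^j$, $y^i < \hat{x}^i$ and $y^k = \hat{x}^k$ for $k \neq i, j$. But $\hat{p} y - \hat{p} \hat{x} = \hat{p}^j y^j - \hat{p}^j \hat{x}^j > 0$, contradicting the definition of $\hat{p}$ (see Eq. (2.1), keeping in mind that $\lambda = 1$). This establishes Eq. (2.11).

Since by (A.5), $(\hat{x}, \hat{x}) \in \mathcal{T}$ and $\hat{x} \gg 0$, the following useful property is obvious:

$$\text{There exists } \theta > 0 \text{ such that } (\hat{x}, \theta\omega) \in \mathcal{T} \text{ where } \omega = (1, \ldots, 1) \in R^m. \tag{2.12}$$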


Define $G = \{c = (c_t): c_t = y_t - x_t$ for all $t \geq 1$, $x_0 = x$, $(x_t, y_{t+1}) \in \mathcal{T}$ for all $t \geq 0\}$. Clearly, $G$ contains all feasible consumption programs $c = (c_t)$, which satisfy these properties and the additional requirement that $c_t \geq 0$ for all $t$. By Eq. (2.1) for any feasible consumption program $c = (c_t)$ one has for all $T > 0$

$$\sum_{t=1}^{T} \hat{p} c_t \leq \hat{p} x - \hat{p} x_T \leq \hat{p} x. \tag{2.13}$$

Since $\hat{p} \gg 0$ (by Eq. (2.11)), for any feasible consumption program $c = (c_t)$

$$\| c \| \equiv \sum_{t=1}^{\infty} | c_t | \leq \alpha \hat{p} x, \quad \text{where } | c_t | = \sum_{i=1}^{m} | c_t^i |, \tag{2.14}$$

where $\alpha > 0$ is determined by $\hat{p}$.

Let $X$ be the linear space of all sequences $c = (c_t)$ such that $\| c \|$ is finite. An element $p$ of $X^*$, the set of all continuous linear functionals on $X$, can be represented⁴ as a sequence $p = (p_t)$ such that $\| p \|^* = \sup_t | p_t |^*$ is finite, where $| p_t |^* = \max_i | p_t^i |$. Let $\mathcal{G} = (G \cap X) - X_+$, where $X_+$ is the set of all nonnegative sequences in $X$. $\mathcal{G}$ is easily seen to be a convex and closed (under pointwise convergence) subset of $X$.⁵

⁴ See [21, p. 64] or [6, p. 289].

Let $\gamma > 0$ be such that $\gamma \hat{x} \leq x$. The program $c^\tau = (0, \ldots, 0, \theta\gamma\omega, 0, \ldots)$ generated by pure accumulation at the von Neumann growth rate until $\tau - 1$, followed by the activity given in Eq. (2.12) to yield consumption in period $\tau$, is feasible. Hence $c^\tau$ is in $\mathcal{G}$ for all $\tau \geq 1$. Clearly $c^0 = 0$ is also in $\mathcal{G}$.

Consider any $c'$ in $X$ satisfying $\| c' \| < \theta\gamma$, and define $\theta_t = | c'_t | / \theta\gamma$ for $t \geq 1$ and $\theta_0 = 1 - \sum_{t=1}^{\infty} \theta_t$. Then $c' \leq \sum_{t=0}^{\infty} \theta_t c^t \equiv c''$, and $c''$ is contained in $\mathcal{G}$ by convexity and closedness under pointwise convergence. Hence $c' \in \mathcal{G}$ and we have proved that $\mathcal{G}$ has an interior point.

Consider the given efficient program $(x^*, y^*, c^*)$. It is easy to check that $c^*$ is in the boundary of $\mathcal{G}$. Hence by a separation argument there is⁶ a nonzero continuous linear functional $p^* = (p_t^*)$ on $X$ satisfying

$$\sum_{t=1}^{\infty} p_t^* c_t^* \geq \sum_{t=1}^{\infty} p_t^* c_t \quad \text{for all } c = (c_t) \text{ in } \mathcal{G}. \tag{2.15}$$

Since $\mathcal{G}$ contains all nonpositive sequences in $X$, $p_t^*$ must be nonnegative for each $t$. The proof will be completed by a demonstration that (2.15) implies the competitive condition Eq. (2.5).

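For $\tau \geq 1$, define $c'_t = c''_t = c_t^*$ for all $t \neq \tau, \tau + 1$; $c'_\tau = c_\tau^* - x_\tau$; $c'_{\tau+1} = c_{\tau+1}^* + y_{\tau+1}$; $c''_\tau = c_\tau^* + x_\tau^*$; $c''_{\tau+1} = c_{\tau+1}^* - y_{\tau+1}^*$. One can verify that $c'$ results from augmenting production in $\tau$ by $(x_\tau, y_{\tau+1})$ in $\mathcal{T}$, and $c''$ results from reducing production to zero in $\tau$. Hence, $c'$ and $c''$ are in $\mathcal{G}$, implying from (2.15) that

$$p_{\tau+1}^* y_{\tau+1} - p_\tau^* x_\tau \leq 0 \quad \text{for all } (x_\tau, y_{\tau+1}) \in \mathcal{T},\ \tau \geq 1, \tag{2.16}$$

$$p_{\tau+1}^* y_{\tau+1}^* - p_\tau^* x_\tau^* = 0 \quad \text{for } \tau \geq 1. \tag{2.17}$$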

Next consider $c$ defined by $c_1 = y_1^*$, $c_t = 0$ for $t > 1$. Clearly $c$ is in $\mathcal{G}$, and $p_1^* y_1^* \leq \sum_{t=1}^{\infty} p_t^* c_t^*$. On the other hand, summing the conditions (2.17),

$$0 = \sum_{t=1}^{T-1} (p_{t+1}^* y_{t+1}^* - p_t^* x_t^*) = \sum_{t=1}^{T} p_t^* c_t^* + p_T^* x_T^* - p_1^* y_1^*.$$

Hence,

$$\lim_{T \to \infty} p_T^* x_T^* = p_1^* y_1^* - \lim_{T \to \infty} \sum_{t=1}^{T} p_t^* c_t^* = p_1^* y_1^* - \sum_{t=1}^{\infty} p_t^* c_t^* \leq 0. \tag{2.18}$$

Since $p_T^* x_T^* \geq 0$, this implies the transversality condition (2.6).

⁵ One can easily adapt the arguments in [14, Lemma II, p. 45].

⁶ See [10, Theorem 14.2].

Note that $p_1^* > 0$; otherwise, the argument following Eq. (2.10) would imply that the sequence $p^*$ is null, a contradiction. The supposition that there exists $(x, y) \in \mathcal{T}$ with $p_1^* y > p_1^* y_1^*$ implies that the program $c'$ in $\mathcal{G}$ with $c'_1 = y$, $c'_t = 0$ for $t > 1$ would contradict (2.15). Thus,

$$z^* \equiv p_1^* y_1^* \geq p_1^* y \quad \text{for all } (x, y) \in \mathcal{T}. \tag{2.19}$$

Define the convex set $K$ as

$$K = \{(x', z) \in R^m \times R: x' \geq 0,\ z \leq p_1^* y \text{ for any } y \text{ such that } (x', y) \in \mathcal{T}\}.$$

Notice that Eq. (2.19) implies that $(x, z^*)$, with $x$ the initial stock, is a boundary point of $K$, which clearly has interior points. Hence, there is some nonzero $(p, \lambda')$ in $R^m \times R$ such that

$$p x + \lambda' z^* \geq p x' + \lambda' z \quad \text{for all } (x', z) \text{ in } K. \tag{2.20}$$

Note that (a) $\lambda' < 0$ is impossible (choose $(x, z)$ with $z < z^*$ on the right-hand side of Eq. (2.20)); (b) $p = 0$ or $p^i > 0$ is impossible (choose $(\beta' x, \beta' y_1^*)$ in $\mathcal{T}$ for a sufficiently large $\beta'$ to contradict Eq. (2.20)); (c) $\lambda' = 0$ is impossible (since $p x \geq p x'$ for all $x' \geq 0$ and $-p > 0$, $x \gg 0$ will also contradict Eq. (2.20)). Hence, define $p_0^* = -(1/\lambda') p$. Clearly $p_0^* > 0$ and for any $(x', y) \in \mathcal{T}$ one has $p_1^* y - p_0^* x' = p_1^* y + (p x'/\lambda') = (1/\lambda')[\lambda' p_1^* y + p x'] \leq (1/\lambda')[\lambda' z^* + p x] = z^* + (p/\lambda') x = p_1^* y_1^* - p_0^* x = 0$. This completes the proof of Eq. (2.4) as well as the necessity part of the theorem. Q.E.D.

Remark 1. It should be emphasized that the condition of present value maximization (Eq. (2.15)) is a result of independent interest and does not follow from the well-known alternative approaches leading to the existence of Malinvaud prices; in a closed model such prices can be shown to exist under assumptions (A.1) through (A.4) (see, e.g., [17] or [21]).

Remark 2. Note that the sufficiency half of the theorem does not depend on (A.5), the strict positivity of von Neumann stocks. On the other hand, the necessity part of the theorem remains valid if (A.6) is replaced by

(A.7) $\hat{p} \gg 0$, i.e., there is a strictly positive von Neumann price vector.

It has been shown that the properties (A.5) and (A.7) follow from indecomposability and some other assumptions on $\mathcal{T}$ (see e.g., [20, p. 205; 19, p. 180]). Actually, (A.7) holds whenever 0 is a maximal point of the closure $\bar{Z}$ of the convex cone $Z = \{y - x: (x, y) \in \mathcal{T}\}$. (See [20, pp. ?5-36].)

In view of the remarks above, and for a convenient organization of the 
proofs of the following sections (in which the assumption of output 
substitution is never made), it is useful to have the following precise 
statement to refer to: 

Theorem 2.2. Under assumptions (A.1) through (A.5) and (A.7), if a feasible program $(x^*, y^*, c^*)$ from $x > 0$ is efficient, then there is a nonnull sequence $(p_t^*)$ of nonnegative price vectors such that for all $t \geq 0$

$$0 = p_{t+1}^* y_{t+1}^* - p_t^* x_t^* \geq p_{t+1}^* y - p_t^* x \quad \text{for all } (x, y) \in \mathcal{T} \tag{2.21}$$

and

$$\lim_{t \to \infty} p_t^* x_t^* = 0. \tag{2.22}$$


III. Some Further Results
IIIa. Input Substitution

In this section we present some results for models in which output substitution does not hold. In what follows, however, we do have to restrict (see Example 3.2 below) our analysis to efficient programs that are “interior” in the sense of using strictly positive input vectors at each date. Formally, a feasible program $(x, y, c)$ from $x$ is an interior program if $x_t \gg 0$ for all $t$. While the restriction to interior programs is somewhat ad hoc, it is weaker than the assumption of Cass [2] requiring a strictly positive lower bound on input levels. It is easy to check that if an interior program is competitive, the associated price vectors $(p_t)$ must satisfy $p_t > 0$ for all $t$,⁷ although it is not necessary that $p_t \gg 0$.⁸

First, instead of the assumption of output substitution we consider the following assumption of input substitution.

(A.6') Suppose that $(x, y) \in \mathcal{T}$ and $x^i > 0$ for some $i$. Given any $j \neq i$ and $\delta_j > 0$, there exists $\delta_i$ satisfying $0 < \delta_i < x^i$ such that $(x', y) \in \mathcal{T}$ where $x'^i = x^i - \delta_i$, $x'^j = x^j + \delta_j$ and $x'^k = x^k$ for all $k \neq i, j$.

The following theorem is easily proved: 

⁷ Let $t$ be the first period for which $p_t = 0$. From Eq. (2.10), $t > 0$. Then $0 < p_{t-1} x_{t-1} = p_t y_t = 0$, for a contradiction.

⁸ The example of Arrow (Footnote 2) can be easily modified to show this. See, for example, [12, Diagram 1 and discussion, p. 289].


Theorem 3.1. Under (A.1) through (A.5), and (A.6'), an interior program $(x^*, y^*, c^*)$ from $x \gg 0$ is efficient if and only if there exists a nonnull sequence $(p_t^*)$ of nonnegative price vectors such that for all $t = 0, 1, 2, \ldots$,

$$0 = p_{t+1}^* y_{t+1}^* - p_t^* x_t^* \geq p_{t+1}^* y - p_t^* x \quad \text{for all } (x, y) \in \mathcal{T} \tag{3.1}$$

and

$$\lim_{t \to \infty} p_t^* x_t^* = 0. \tag{3.2}$$

Proof. (Sufficiency). In view of Malinvaud's theorem it is enough to show, by using (A.6'), that the sequence $(p_t^*)$ satisfying (3.1) and (3.2) also satisfies $p_t^* \gg 0$ for all $t$. Suppose for some $t$ and $i$, $p_t^{*i} = 0$. Since $p_t^* > 0$, $p_t^{*j} > 0$ for some $j \neq i$ and by (A.6') there is $(x, y_{t+1}^*) \in \mathcal{T}$ where $x^j < x_t^{*j}$, $x^i > x_t^{*i}$ and $x^k = x_t^{*k}$ for $k \neq i, j$. Then $p_{t+1}^* y_{t+1}^* - p_t^* x > 0$, contradicting Eq. (3.1).

(Necessity). Use Theorem 2.2, since (A.6') and $\hat{x} \gg 0$ imply $\hat{p} \gg 0$ (using exactly the argument leading to the strict positivity of competitive prices in the sufficiency part).

In the discussion above, the restriction to interior programs seems 
unavoidable. It is instructive to look at the following: 

Example 3.1. Let $\mathcal{T} = \{(x, y): y^1, y^2 \leq ((x^1)^{1/2} + (x^2)^{1/2})^2/4,\ x \geq 0,\ y \geq 0\}$, with $x = (1, 1)$. This is a technology satisfying (A.5') but not (A.5). The program $y_1 = (1, 1)$, $c_1 = (1, 0)$, $c_t = (0, 0)$ for all $t \geq 2$ is seen to be inefficient, but satisfies (3.1) and (3.2) with respect to prices $p_0 = p_1 = (1, 0)$ and $p_t = (0, 0)$ for $t \geq 2$.

In connection with the application of our Theorem 3.1 to the “sausage machine” technologies of Samuelson and Solow, it should be mentioned that the theorem can also be proved if (A.5') is replaced by the condition of primitivity appearing in the related literature (see [19, p. 179]). This condition requires that for any $x > 0$ there is a finite sequence $(x_t, y_{t+1}) \in \mathcal{T}$ such that $x_0 = x$, $y_T > 0$.

IIIb. The Polyhedral Case: A Counterexample

The substitution conditions discussed above are typically not satisfied when the technology is a polyhedral convex cone (i.e., generated by a finite set of basic activities; see, for example, [11, p. 79]). The sufficiency half of Theorem 2.1 ceases to be valid, as can be seen from the simplest examples. It is natural to inquire whether by using the polyhedral structure one can sharpen the necessity half of that theorem to derive a sequence of strictly positive competitive prices supporting an efficient program. Together with Malinvaud's sufficiency theorem, this would then provide us with a complete characterization in terms of strictly positive competitive prices directly analogous to a standard result applicable to models with a finite number of commodities and a technology that is a polyhedral convex cone (see [7, pp. 306-308] or [20, pp. 186-187]). The remarks in [11, p. 111, Footnote 1] seem to suggest that this is indeed possible.

We now give an example of an efficient program in a simple polyhedral model with three goods such that there is no sequence of strictly positive prices $(p_t^*)$ that are competitive. Let $\mathcal{T} = \{(x, y) \geq 0: y \leq Cx\}$, with

$$C = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 0 \\ 1 & 0 & 2 \end{bmatrix}, \qquad C^{-1} = \begin{bmatrix} 2/3 & -1/3 & 0 \\ -1/3 & 2/3 & 0 \\ -1/3 & 1/6 & 1/2 \end{bmatrix}.$$

Consider the program $x = x_t = (1, 1, 1)$, $y_t = (3, 3, 3)$, $c_t = (2, 2, 2)$ for $t \geq 1$. This program corresponds to production and consumption in von Neumann proportions, and is clearly efficient. Prices must satisfy the difference equation $p_{t+1}^* = C'^{-1} p_t^*$, which has the general solution

$$p_t^* = c_1 (1/3)^t (1, 1, 0) + c_2 (1/2)^t (0, 1, -1) + c_3 (1, -1, 0).$$

The only solution allowing $p_t^* \geq 0$ for all $t$ is $c_1 > 0$, $c_2 = c_3 = 0$, implying $p_t^{*3} = 0$, as was to be demonstrated. Note that $p_t^* = ((1/3)^t, (1/3)^t, 0)$ satisfies the transversality condition as well as the competitive conditions.
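The counterexample can be verified with exact rational arithmetic. The sketch below assumes the matrix $C$ with rows (2, 1, 0), (1, 2, 0), (1, 0, 2), which is the reading of the printed matrix consistent with the stated program and eigenvectors; the function names are illustrative only. It checks that the three-parameter family solves the price recursion, that the supporting prices give zero profit and a vanishing input value, and that a nonzero $c_2$ or $c_3$ eventually drives some price negative.

```python
from fractions import Fraction as Fr

C = [[2, 1, 0], [1, 2, 0], [1, 0, 2]]  # assumed reading of the printed matrix

def row_times_matrix(p, M):
    """Row vector p times matrix M."""
    return [sum(p[k] * M[k][i] for k in range(3)) for i in range(3)]

def price(t, c1=Fr(1), c2=Fr(0), c3=Fr(0)):
    """General solution c1 (1/3)^t (1,1,0) + c2 (1/2)^t (0,1,-1) + c3 (1,-1,0)."""
    a, b = Fr(1, 3) ** t, Fr(1, 2) ** t
    return [c1 * a + c3, c1 * a + c2 * b - c3, -c2 * b]

def first_negative_period(c1, c2, c3, horizon=200):
    """First t at which some price component is negative, or None within the horizon."""
    for t in range(horizon + 1):
        if min(price(t, c1, c2, c3)) < 0:
            return t
    return None

x, y = [1, 1, 1], [3, 3, 3]
for t in range(10):
    p_now, p_next = price(t), price(t + 1)
    assert row_times_matrix(p_next, C) == p_now                # p_{t+1} C = p_t
    profit = sum(a * b for a, b in zip(p_next, y)) - sum(a * b for a, b in zip(p_now, x))
    assert profit == 0                                         # zero profit along the program

print(price(5), sum(price(5)))                                 # input value p_t x_t at t = 5
print(first_negative_period(Fr(1), Fr(-1, 100), Fr(0)))
print(first_negative_period(Fr(1), Fr(0), Fr(1, 1000)))
```

Since the input vector is (1, 1, 1), the input value is just the sum of the price components, which here equals $2 (1/3)^t$ and so tends to zero while the third price stays at zero throughout.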

IIIc. Technological Change

One can dispense with the assumption that the technology of the economy does not change over time. We shall sketch a possible generalization, referring the interested reader to the analysis of McFadden [14] for further details. Suppose that (the present value) technology $\mathcal{T}_t$ at date $t$ (= 0, 1, ...) satisfies (A.1) through (A.4). The sufficiency part of Theorem 2.1 can be established by following the same arguments as above if $\mathcal{T}_t$ satisfies (A.6) for all $t$. To obtain an extension of the necessity part, the assumptions can be recast in the following manner. Let $\mathcal{T}^*$ be the smallest closed convex cone containing all the (present value) technologies $\mathcal{T}_t$. The necessity part is obtained if, in addition to (A.1) through (A.4) holding for each $\mathcal{T}_t$, one has

(A.8) $\mathcal{T}^*$ has a von Neumann growth rate equal to one; $(0, y)$ in $\mathcal{T}^*$ implies $y = 0$; and $\mathcal{T}^*$ contains no sequence $(x_n, y_n)$ with $y_n - x_n$ having a nonnegative nonzero limit point.

(A.8') There exists a positive scalar $\alpha$ such that for each $t$ commodity vectors $b^1, \ldots, b^t$ can be found with $(\omega, b^1)$ in $\mathcal{T}_0$, $(b^1, b^2)$ in $\mathcal{T}_1$, ..., $(b^{t-1}, b^t)$ in $\mathcal{T}_{t-1}$ and $b^t > \alpha\omega$ where $\omega = (1, \ldots, 1)$.

It follows from (A.8) that there is $\hat{p} \gg 0$ such that $\hat{p}(y - x) \leq 0$ for all $(x, y)$ in $\mathcal{T}^*$. If technological change is neutral or biased towards balanced growth and there is a strictly positive von Neumann ray contained in all $\mathcal{T}_t$, then (A.8') holds.


IV. Open Versus Closed Models 

The results obtained in Sections II, IIIa, and IIIc supplement that of Majumdar [16] in which the technology need not permit such substitution and yet efficiency can be completely characterized in terms of intertemporal profit maximization and transversality conditions (see [16]). However, this characterization is quite different from that of Cass [2]. For the neoclassical model, or for a closed one-good model in which the technology is described by a production function $f$ that is strictly concave and of class $C^2$, the Cass criterion is applicable to interior programs satisfying

$$\bar{x} \geq x_t \geq \underline{x} > 0 \quad \text{for all } t, \tag{4.1}$$

where $x_t$ is the input (per capita capital in the neoclassical case) at date $t$. The competitive prices $(p_t)$ associated with a program are defined by $p_t = 1/\pi_t$, with $\pi_t = \prod_{s=0}^{t-1} f'(x_s)$, $\pi_0 = 1$. Inefficiency of a program $(x, y, c)$ is equivalent to finiteness of $\lim_{T \to \infty} \sum_{t=0}^{T} \pi_t$. In view of the bounds (4.1), we easily see that inefficiency of $(x, y, c)$ is equivalent to finiteness of

$$\lim_{T \to \infty} \sum_{t=0}^{T} (1/p_t x_t). \tag{4.2}$$

It is easy to figure out the relation between (4.2) and the transversality condition and the essential difference between the two criteria. It would, of course, be convenient if one could attribute the qualitative difference entirely to the presence or absence of a nonproducible commodity (like labor that sets an upper bound on all feasible per capita variables in the neoclassical model). That this is, strictly speaking, not the case will be clear from the following special example.
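For stationary programs the sum in the Cass criterion is geometric, which makes the contrast easy to see numerically. The sketch below assumes a hypothetical production function $f(x) = 2\sqrt{x}$ and two hypothetical stationary stocks chosen so that both yield the same consumption; these choices are for illustration only and are not taken from the paper.

```python
import math

def f(x):
    """Hypothetical strictly concave production function."""
    return 2.0 * math.sqrt(x)

def f_prime(x):
    """Derivative of f."""
    return 1.0 / math.sqrt(x)

def cass_partial_sum(x, T):
    """Sum of pi_t for t = 0..T along the stationary program x_t = x,
    where pi_0 = 1 and pi_{t+1} = pi_t * f'(x)."""
    pi_t, total = 1.0, 0.0
    for _ in range(T + 1):
        total += pi_t
        pi_t *= f_prime(x)
    return total

high, low = 2.25, 0.25                 # both give consumption f(x) - x = 0.75
for stock in (high, low):
    print(stock, f(stock) - stock, f_prime(stock),
          cass_partial_sum(stock, 30), cass_partial_sum(stock, 200))
```

At the higher stock the marginal product is below one, the partial sums settle near a finite limit, and the program is inefficient by the criterion; at the lower stock the marginal product exceeds one and the partial sums grow without bound. A finite partial sum is only suggestive, of course: the conclusion here rests on the closed form of the geometric series.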

Example 4.1. The Leontief model of Gale [7, pp. 300-301]. The technology is $\mathcal{T} = \{(l, x, y) \geq 0: Ay \leq x,\ ay \leq l\}$, where $l$ is the input of “labor” (the nonproducible good that is not consumed), $A$ is an $n \times n$ positive matrix and $a$ is a strictly positive vector, giving the input requirements of producible goods and labor respectively. The condition for a program $(l^*, x^*, y^*, c^*)$ to be competitive is that there exist nonnegative sequences $(p_t^*)$, $(w_t^*)$ such that

$$0 = p_{t+1}^* y_{t+1}^* - p_t^* x_t^* - w_t^* l_t^* \geq p_{t+1}^* y_{t+1} - p_t^* x_t - w_t^* l_t$$

for $(l_t, x_t, y_{t+1}) \in \mathcal{T}$. This requires

$$p_{t+1}^* \leq p_t^* A + w_t^* a$$

and $(p_{t+1}^* - p_t^* A - w_t^* a) y_{t+1}^* = 0$. If $y_{t+1}^* \gg 0$, so that equality holds, the general solution of this difference equation is

$$p_t^* = p_1^* A^{t-1} + \sum_{s=1}^{t-1} w_s^* a A^{t-1-s}.$$

An efficient program satisfies Xj-i — Ay-, — A‘y t with A‘y, -* 0. 
Clearly taking w,* = 0 provides prices for which p*y t — 0 is a criterion 
for efficiency." 


V. An Application to the Problem of 
Characterizing Pareto Optimality 

This section applies some of the previous results to provide a complete characterization of Pareto optimal distributions when consumers of overlapping generations are introduced in the model. The questions of efficient allocation of resources in production and Pareto optimal distributions of goods among consumers are usually treated distinctly in the literature, the former following the lead of Malinvaud, and the latter following Samuelson's pure exchange model. We consider both the questions in a multisector model with production. A study of Pareto optimality in which for each period there is a utility function $u_t$ defined on aggregate consumption $c_t$ and Pareto comparisons among alternative programs are
" The result follows directly from the fact that a program ( x *, y*, c*) is efficient if 
If <* — Fi*. Proof of sufficiency as well as the basic steps in necessity is exactly 
the same as in [16]. To note the only difference, observe that if an efficient program has 
= yt* - 8 with 8 - 0, then y,' = >,* , y/ = zZ.A‘-c,*, x,_, = Ay,' 
for s > 2, Ci' — Ci* + S, c,' = c,* for s > 2 constitute a feasible program. To check 
that the labor constraint is satisfied, y,' = Z.Z,A‘-c,* < y* so that ay,' < ay,* < l,. 
It follows that (c/) dominates ( c,*). 



made with the utility sequences $u_t(c_t)$ is far more obviously related to the analysis of efficiency (see [17]). In fact, if there is just one good and the utility functions $u_t$ are monotonically increasing, then the two problems are structurally identical since a consumption program $(c_t)$ is efficient if and only if the corresponding utility sequence $u_t(c_t)$ is Pareto optimal. With many goods, a result parallel to Theorem 5.1 also has been worked out by us to characterize long-run Pareto optimality in that framework. The distributional “paradoxes of infinity” involved in the Samuelsonian case of overlapping generations in which Pareto comparisons are made with lifetime utilities have been the object of much discussion and, therefore, this case seems to be the more natural framework for a detailed study. The technology is that of the closed Dorfman-Samuelson-Solow model in which substitution conditions hold, and we indulge in differentiability assumptions to keep the exposition simple and to get the sharpest result.

A more general analysis of Pareto optimality in infinite horizon economies is undoubtedly of importance, and will be the subject of a forthcoming paper.

Va. The Model with Overlapping Generations 

To keep the notation simple, consumers are assumed to live for two periods. Those born at the beginning of period $t$ and dying at the end of period $t + 1$ constitute the $t$th generation. We have verified¹⁰ that what follows is valid if the consumers are assumed to live for a finite number (> 2) of periods, and no change in the strategy of the proof is necessary (excepting introduction of some more involved notation). The preferences of all consumers of a particular generation are alike, and are conveniently represented by a real-valued concave strongly monotonic continuous utility function $U_t({}^1c, {}^2c)$ on $(R^{2m})_+$, where ${}^jc = ({}^jc^i)$ is the consumption vector of the $t$th generation in the $j$th period of its lifetime ($j = 1, 2$). The utility function $U_t$ is continuously differentiable in the interior of $(R^{2m})_+$. The technology used in this section satisfies the assumptions listed in Example 3.2, and is described by the DOSSO transformation locus $F$, so that we have

$$\mathcal{T} = \{(x, y): 0 \leq F(y^1, \ldots, y^m; x),\ x \geq 0\}. \tag{5.1}$$

We assume that $\mathcal{T}$ satisfies (A.5).

The initial stocks $x_0$ and the consumption of the “old” people in period 1, denoted by ${}^2c_1$, are assumed to be given. A feasible program $(u, x, y, c)$ from $(x, c)$ consists of nonnegative sequences $\{(u_t), (x_t), (y_{t+1}), (c_{t+1})\}$ satisfying

$$\begin{aligned} &x_0 = x, \quad {}^2c_1 = c; \\ &y_{t+1} = c_{t+1} + x_{t+1}, \quad (x_t, y_{t+1}) \in \mathcal{T} \quad \text{for all } t \geq 0; \\ &c_{t+1} = {}^2c_{t+1} + {}^1c_{t+1} \quad \text{for all } t \geq 0; \end{aligned} \tag{5.2}$$

and $u_t = U_t({}^1c_t, {}^2c_{t+1})$ is the sequence of utilities of different generations from the proposed schemes of allocation and distribution.

¹⁰ See the earlier version which appeared as Discussion Paper No. 85 circulated by the Department of Economics, Cornell University.

We restrict our attention to regular interior programs (u , x, y, c ) satis¬ 
fying x t > 0 and *c, > 0 for all t. A feasible program («*, x*, y*, c*) 
is short-run Pareto-optimal if there is no other feasible program $(u, x, y, c)$ such that

$$[u_1, u_2, \ldots, u_t, ({}^1c_{t+1})] > [u_1^*, u_2^*, \ldots, u_t^*, ({}^1c_{t+1}^*)] \tag{5.3}$$

for some finite $t \geq 1$. It is long-run Pareto-optimal if there is no other feasible program $(u, x, y, c)$ with $u_t \geq u_t^*$ for all $t \geq 1$, strict inequality being valid for some $t$.

It is immediate that a long-run Pareto-optimal program is necessarily 
short-run Pareto-optimal. However, the converse is not true: a program
such that every finite segment is short-run Pareto-optimal need not be 
long-run Pareto-optimal. This justifies the above distinction. Our objective 
is to identify long-run Pareto-optimality with efficient, short-run Pareto- 
optimal programs, and thus to establish the link between efficiency and 
Pareto-optimality. 

Theorem 5.1. A regular interior program (u*, x*, y*, c*) from (x, c) 
is long-run Pareto-optimal if and only if (a) it is short-run Pareto-optimal 
and (b) efficient. 

Proof. Necessity being obvious, let us go directly to the nontrivial sufficiency part. The first step is to note that by using the Kuhn-Tucker theorem, if a regular interior program $(u^*, x^*, y^*, c^*)$ from $(x, c)$ is short-run Pareto-optimal, there is a sequence $(q_t^*, p_t^*)$ of price vectors with $q_t^* > 0$ (in $R$) and $p_t^* > 0$ (in $R^m$) such that

$$\begin{aligned} \text{(i)} \quad & q_t^* u_t^* - p_t^*\, {}^1c_t^* - p_{t+1}^*\, {}^2c_{t+1}^* \geq q_t^* U_t({}^1c, {}^2c) - p_t^*\, {}^1c - p_{t+1}^*\, {}^2c \\ & \text{for all } ({}^1c, {}^2c) \geq 0 \text{ and all } t \geq 1, \\ \text{(ii)} \quad & p_t^* y_t^* - p_{t-1}^* x_{t-1}^* \geq p_t^* y - p_{t-1}^* x \\ & \text{for all } (x, y) \in \mathcal{T} \text{ and } t \geq 1. \end{aligned} \tag{5.4}$$


Actually, by exploiting the differentiability assumption, the prices 
in Eq. (5.4) can simply be defined as: 


for 


t 5= 2, 


where 


pV = dFI8x 0 * (/ = 1 ,..., m); 

pV = —dF/dyi (i = 1 ,..., m - 1); p*™ = 1 


P? m = 1 /tt, , p,*‘ = - dFjdy i i jir i (i = 1,..., m - 1), 

(5.5) 


= n 8Fldx, m ; 


g„* — q* = 1 and = 0 dU t /8 l c t m )lrr t for t > 2. 

It is understood that all the derivatives in Eq. (5.5) are computed at 
(«*, x*, y*, c*). The computational details leading to Eqs. (5.4) and (5.5) 
are tedious but straightforward, and are omitted. 11 


11 The problem is to maximize F(Xj\ 1 + <4+i >•••, x "r+ \ + c t+'< x t) subject to 

v r+l s v* +1 , 

among the set of feasible programs from x. Set up the Lagrangean 


L(-) = F(*« +c‘ +1 , 


- x tT + c ?:l > x ? + X \WF c , > ' c tJ - »,*] 


. + y Pc 1 — v*' > 

T L, M r+i v m T+i’ 

i-i 

To check the constraint qualification, we construct a feasible program from x, which 
satisfies all the constraints with strict inequality. Let 

K = max [x * { , >•*',], i = 1. m, 

k = min [x* 1 , y*',], r = 0,..., T. 

By assumption that the program («*, x*, y*, c*) is regular interior, we know that 
k > 0. 

For 1 3 0 < t < T, since from x*, output is producible, so the output 
given by LPJ +1 ,...,5»^j‘] = 0#, .....yjTf 1 ) and 0 <?? + 1 < J'&i is also producible 
[by the derivative conditions on F(-)]. Also 9i +1 > 0 is producible from £,, given by 
[ii 1 ,..., S^ -1 ] = [x* 1 ,...,***- 1 ] and £ t m = x* m — c ( where 0 < «, < k/2. By taking a 
suitable convex combination (0 < A, < 1), from ?, = [x*‘. x*, m — A^ ( ], we 





Suppose that a regular interior program $(u^*, x^*, y^*, c^*)$ is short-run optimal and efficient, but is not long-run optimal. Then there is some program $(\bar{u}, \bar{x}, \bar{y}, \bar{c})$, such that $\bar{u}_t \geq u_t^*$ for all $t \geq 1$, strict inequality being valid for at least one $t = t'$. Let $\bar{u}_{t'} - u_{t'}^* = \gamma > 0$. Then, by Eq. (5.4) we have for all $T \geq t'$

$$q_{t'}^* \gamma \leq \sum_{t=1}^{T} q_t^* (\bar{u}_t - u_t^*) \leq p_{T+1}^* (x_{T+1}^* - \bar{x}_{T+1}) + p_{T+1}^* ({}^1c_{T+1}^* - {}^1\bar{c}_{T+1}) \leq p_{T+1}^* x_{T+1}^* + p_{T+1}^*\, {}^1c_{T+1}^*. \tag{5.6}$$

Since $\sum_{t=1}^{\infty} p_t^* c_t^* \leq p_0^* x$ (by Eq. (2.7)), there is $\hat{T}$ such that for all $T \geq \hat{T}$, $p_{T+1}^* c_{T+1}^* \leq q_{t'}^* \gamma/4$. Note that due to the differentiability assumptions, the prices $(p_t^*)$ defined in Eq. (5.5) are the unique competitive prices associated with the program $(u^*, x^*, y^*, c^*)$. Hence, by Theorem 2.1, the transversality condition Eq. (2.6) must necessarily be satisfied at these prices, i.e., $\lim_{t \to \infty} p_t^* x_t^* = 0$. Hence, there is $T^*$ such that

can produce >',+, given by ffL, .....y"*, 1 ] = [>?+, + v\ +i , y*?^ + vTnl where 
0 f° r all i and i > 0- Let 8, --- A,*, and 8 = min, 8,. 

Since F m m is continuous on [k/2, A], so there exists M > 0, such that F t m < M, for 
A > (x\ y‘) > k!2 for all /. 

Now, let n„ ~ 0, n, — min [S/(2A/) r ' ,1 ~', 1], t — 1. T + 1, and construct the 

required program («, at, .v, c) as follows. Ar„ = x; and for / — 0 . T, 


y - [v* y m "l y*" 1 — n ] 

V,,, - (at* 1 ,..., A-* m -‘ , AT*'" - 2u ], 

Ml 1 1+1 ’ ’ l+l ’ (+1 'H1 J ' 

r =+_ r r *i 4- r* m-1 4- T) m 1 r* m 4- u 1 

m 1+1 ~ 'm* |l m T "in ’ S+i T 

V -- [>(•*' -f ,1 .i f *i»-i j _»-t i r »™ i 1. 

1+1 1 111 'HI* Ml ~ '111 * HI ' *1+1” 

V ... |I,*I B,.*»n 1 

1+1 1 (+1 *'"* 1+1 

To check feasibility note that 

(a) (at, + , , y l+1 , V, tl , V, +1 ) > 0, t = 0,..., T, 

(b) y,u = a:, + , + c,+, ;c,+i = ‘c+i + *c l+l ; t = 0,..., T, 

(c) (x, ,y l+l )eF i = 0,..., T. 

To see this, note that since M i < 8 < 8, = A,c,, so F(y| +1 ., at," 1 ) > 0, and 

. y?+',xr)-F(y* + \ . y*~~\ at,«) =+ P r ‘m(—2fi t ) > ~&(2M)ti,!(2M) T +'-‘> 

— /*i+i • 

V • x<m) > ^+T J , 4 m ) - m» = yi" - M,+. *= y,+i.To 

cneck mat the constraints are satisfied with inequality note that (i) by the derivative 
conditions on U,(), t = 1,..., T, U,(‘c,, *c 1+l ) > u * , and (ii) *c r+ , > V* +1 . 




$p_{t+1}^* x_{t+1}^* \leq q_{t'}^* \gamma/4$ for all $t \geq T^*$. Thus, for $T \geq \max(t', T^*, \hat{T})$ we have $q_{t'}^* \gamma \leq q_{t'}^* \gamma/2$, a contradiction. Q.E.D.

Remark. The assumptions on $F$ rule out the cases in which the derivatives become infinite at zero values of some input or output. In order to allow for such situations, we can require the derivative conditions to hold only at $(y^1, \ldots, y^{m-1}, x) \gg 0$ and appeal to Section III.


References 

1. A. Bose, “Pareto Optimality and Efficient Capital Accumulation,” Discussion 
Paper No. 74-4, Department of Economics, University of Rochester, 1974. 

2. D. Cass, On capital over-accumulation in the aggregative neo-classical model of 
economic growth: A complete characterization, J. Econ. Theory 4 (1972), 200-223. 

3. D. Cass, Distinguishing inefficient competitive growth paths: A note on capital 
over-accumulation, J. Econ. Theory , 4 (1972), pp. 224-240. 

4. S. Chakravarty, "Capital and Development Planning," MIT Press, Cambridge, 
Massachusetts. 

5. R. Dorfman, P. Samuelson and R. Solow, “Linear Programming and Economic 
Analysis,” McGraw-Hill, New York, 1958. 

6. N. Dunford and J. Schwarz, “Linear Operators, Part I,” Interscience, New 
York, 1958. 

7. D. Gale, “The Theory of Linear Economic Models,” McGraw-Hill, New York., 
1958. 

8. L. Hurwicz, Programming in linear spaces, in “Studies in Linear and Non-linear 
Programming,” (K. Arrow, L. Hurwicz, and H. Uzawa, Eds.), Stanford, Calif., 
1958. 

9. S. Karlin, “Mathematical Methods in Games, Programming and Mathematical 
Economics,” Vol. I, Addison-Wesley, Reading, Mass., 1959. 

10. J. L. Kelley and I. Namioka, “Linear Topological Spaces,” Van Nostrand, 
Princeton, N.J., 1963. 

11. T. C. Kpopmans, “Three Essays on the State of Economic Science,” McGraw- 
Hill, New York, 1957. 

12. M. Kurz, Tightness and substitution in the theory of capital, J. Econ. Theory 
1 (1969), 244- 272. 

13. M. Kurz and D. Starrett, On the efficiency of competitive programs in an in¬ 
finite horizon model, Rev. Econ. Stud. 37 (1970), 233-268. 

14. D. McFadden, The evaluation of development programs. Rev. Econ. Studies 
34 (1967), 25-50. 

15. L. McKenzie, The Dorfman-Samuelson-Solow turnpike theorem. Internal. 
Econ. Rev. 4 (1963). 

16. M. Majumdar, “Efficient programs in infinite dimensional spaces: A complete 
characterization, J. Econ. Theory 7 (1974), 355-369. 

17. E. Malinvaud, Capital accumulation and efficient Allocation of resources, Econom- 
etrica 21 (1953), 233-268. 

18. T. Mitra, “Efficient Capital Accumulation in a Multi-Sectoral Neo-classical 
Model: A Direct Characterization,” Discussion Paper No. 74-1, Department of 
Economics, University of Rochester. 





19. M. Morishima, “Equilibrium, Stability and Growth,” Oxford Univ. Press, London, 
1964. 

20. H. Nikaido, “Convex Structures and Economic Theory,” Academic Press, New 
York, 1968. 

21. R. Radner, Efficiency prices for infinite horizon production programs, Rev. Econ. 
Stud. 34 (1967), 51-66. 

22. R. Radner, Optimal growth in a linear logarithmic economy, Internat. Econ. 
Rev. 7 (1966), 1-33. 

23. P. A. Samuelson, An exact consumption loan model of interest with or without 
the social contrivance of money, J. Political Econ. 66 (1958). 

24. P. A. Samuelson and R. M. Solow, Balanced growth under constant returns to 
scale, Econometrica, 21 (1953). 

25. K. Shell, Applications of Pontryagin’s maximum principle to economics, in 
“Mathematical Systems: Theory and Economics,” Vol. 1 (H. Kuhn and G. Szego, 
Eds.), Springer-Verlag, New York, 1969. 

26. J. Tinbergen, Optimum savings and utility maximization over time, Econometrica 
28 (1960), 481-489. 

27. S. G. Winter, The norm of a closed technology and the straight-down-the-turnpike 
theorem, Rev. Econ. Stud. 34 (1967), 67-89. 



JOURNAL OF ECONOMIC THEORY 13, 47-57 (1976) 


A Note on the Role of the Transversality Condition in 
Signalling Capital Overaccumulation* 

Tapan Mitra 

Department of Economics, University of Illinois at Chicago Circle, Chicago, Illinois 60680 

AND 

Mukul Majumdar 

Department of Economics, Cornell University, Ithaca, New York 14853 
Received August 8, 1975 


I. Introduction 

One of the main results in [1] asserts the existence of one system of 
competitive prices supporting an efficient program relative to which the 
transversality condition is satisfied; i.e., the sequence of values of inputs 
converges to zero. In general, however, there may be more than one system 
of competitive prices associated with an efficient program and in fact, 
it is possible that the transversality condition holds for one system of 
competitive prices while it does not hold for another (see the example 
below). It is, therefore, natural to look for conditions that guarantee that 
the transversality condition is satisfied for all the competitive price 
systems supporting an efficient program. The purpose of this note is 
precisely to present such a set of conditions applicable to a fairly extensive 
class of “closed” multisector models. Besides the standard assumptions 
on technology (like continuity, convexity, constant returns, free disposal, 
and impossibility of free production), we require that the input vectors 
of all the activities in the von Neumann facet be strictly positive. Under 
such conditions, the transversality condition is obtained for all competitive 

* The need to settle the question studied in this note was emphasized by, among 
others, Professor D. Cass in his detailed comments on our earlier paper with Professor 
D. McFadden. Our interest in the role of the transversality condition is surely due to a 
great extent to his related works. An earlier version of the paper was presented at the 
M.S.S.B. Conference held at the Dartmouth College Conference Center in June 1975. 
The present version has benefited from the helpful comments of Professors Cass, 
Shell, and McKenzie and other participants. 

Copyright © 1976 by Academic Press, Inc. 
All rights of reproduction in any form reserved. 


prices associated with an efficient program. Interpreted from a different 
angle, the main result shows that a competitive program violating the 
transversality condition must necessarily be inefficient. Thus, we have a 
simple and easily applicable criterion for testing the efficiency of a competitive 
program. 

While the technique of proof leading to the main result is quite independent 
of any result derived or discussed in [1], the interested reader 
is referred to that paper for a detailed discussion of the model and the 
related results of Cass, Malinvaud, and others in the published literature. 
It should perhaps be mentioned that in the literature on efficient and 
optimal growth, the “necessity” of the transversality condition—its role in 
signalling capital overaccumulation in competitive programs—has long 
been the subject of much discussion. 


II. Notation 

For any $x = (x^i)$ in $R^m$, $x$ is nonnegative (written $x \ge 0$) if $x^i \ge 0$ for all $i$; it 
is semipositive (written $x > 0$) if $x \ge 0$ and $x \ne 0$; it is strictly positive 
(written $x \gg 0$) if $x^i > 0$ for all $i$. The set of all nonnegative (respectively, 
strictly positive) $m$-vectors is denoted by $R^m_+$ (respectively $R^m_{++}$). The 
$m$-vector $\omega$ has 1 in each coordinate. The norm of $x$ (written $|x|$) is chosen 
as $|x| = \sum_{i=1}^{m} |x^i|$. For any two nonzero vectors $x$ and $x'$, the angular 
distance $d(x, x')$ is given by 

$$d(x, x') = \left|\, x/|x| - x'/|x'| \,\right|. \qquad (2.1)$$

For any nonempty set $F$ and any vector $x$ in $R^m$, 

$$d(x, F) = \inf_{x' \in F} d(x, x'). \qquad (2.2)$$

A sequence $p = (p_t)$ of $m$-vectors is nonzero if $p_t \ne 0$ for at least one $t$. 
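
The notation of this section can be illustrated with a minimal Python sketch. It assumes the norm is the sum of absolute coordinates, and it approximates d(x, F) by a minimum over a finite sample of F rather than a true infimum. The function names `norm1`, `angular_distance`, and `distance_to_set` are illustrative only and do not come from the paper.

```python
def norm1(x):
    """Sum norm |x| = sum_i |x^i| (assumed reading of the paper's norm)."""
    return sum(abs(v) for v in x)


def angular_distance(x, xp):
    """Angular distance d(x, x') = | x/|x| - x'/|x'| | for nonzero vectors."""
    nx, nxp = norm1(x), norm1(xp)
    if nx == 0 or nxp == 0:
        raise ValueError("angular distance is defined only for nonzero vectors")
    return norm1([a / nx - b / nxp for a, b in zip(x, xp)])


def distance_to_set(x, sample_of_F):
    """Approximate d(x, F) by the minimum over a finite sample of F."""
    return min(angular_distance(x, xp) for xp in sample_of_F)
```

Two vectors on the same ray have angular distance zero, so the distance from a point to a cone depends only on the direction of the point, which is how the distance is used in the value-loss argument later in the note.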


III. The Model 

We shall recall only the definitions used in the statement of our result, 
and the assumptions needed. As usual, $\mathcal{T}$ is the technology, a nonempty 
set in $R^{2m}_+$. The pair $(x, y)$ of $m$-vectors belongs to $\mathcal{T}$ if it is possible to 
transform the input vector $x$ into the output vector $y$ in one period, 
where $m$ is the number of commodities. The four standard assumptions 
on $\mathcal{T}$ are: 

(A.1) $\mathcal{T}$ is a closed convex cone in $R^{2m}_+$ (continuity, convexity, 
constant returns). 

(A.2) “$(0, y) \in \mathcal{T}$” implies “$y = 0$” (impossibility of free production). 

(A.3) “$(x, y) \in \mathcal{T}$” and “$x' \ge x$, $0 \le y' \le y$” imply $(x', y') \in \mathcal{T}$ 
(free disposal). 

(A.4) There is $(\bar{x}, \bar{y}) \in \mathcal{T}$ with $\bar{y} \gg 0$ (producibility). 

Our next assumption is related to the nature of the von Neumann 
equilibria associated with $\mathcal{T}$. It is known that (A.1) through (A.4) imply 
that there is a von Neumann equilibrium, i.e., there exist a semipositive 
input vector $\hat{x} > 0$, a (finite) positive scalar $\hat{\lambda} > 0$, and a semipositive 
price vector $\hat{p} > 0$ such that 

$$(\hat{x}, \hat{\lambda}\hat{x}) \in \mathcal{T}, \quad \hat{p}y \le \hat{\lambda}\hat{p}x \text{ for all } (x, y) \in \mathcal{T}, \quad \hat{\lambda} \ge \lambda(x, y) \text{ for all } (x, y) \in \mathcal{T}, \qquad (3.1)$$

where 

$$\lambda(x, y) = \max[\lambda : y \ge \lambda x;\ x > 0].$$

Without loss of generality $\hat{\lambda}$ is taken to be equal to 1. Since the von Neumann 
price vector is by no means unique in general, we consider all such price 
vectors. Formally, let us define 

$$\mathcal{P} = \{p \in R^m : p > 0,\ |p| = 1,\ py \le px \text{ for all } (x, y) \in \mathcal{T}\}. \qquad (3.2)$$

It is trivial to verify that $\mathcal{P}$ is a closed convex set. Recall that the von 
Neumann-McKenzie facet $F^*$ is simply defined as the set of all activities 
breaking even at any $p$ in the relative interior of $\mathcal{P}$, i.e., one has (see 
[2, p. 171]) 

$$F^* = \{(x, y) \in \mathcal{T} : py = px\} \text{ for any } p \text{ in the relative interior of } \mathcal{P}. \qquad (3.3)$$

It is known that $F^*$ is a closed convex cone with vertex at $(0, 0)$. Our 
next assumption requires that for any activity in $F^*$ other than $(0, 0)$, 
the input vector must be strictly positive. Formally, we have 

(A.5) for any $(x, y) \in F^*$, with $(x, y) \ne (0, 0)$, one must have $x \gg 0$. 

In particular, the vector $\hat{x}$ of von Neumann stocks defined in (3.1) must 
also satisfy $\hat{x} \gg 0$. An important consequence of (A.5) is that 

any von Neumann price vector $\hat{p}$ is strictly positive. (3.4) 





If $\hat{p}^i = 0$ for any $i$, the activity $(x, 0)$ in $\mathcal{T}$ with $x^i > 0$ and $x^j = 0$ for 
$j \ne i$ clearly breaks even at $\hat{p}$. Hence it belongs to $F^*$, contradicting the 
strict positivity requirement of (A.5), and we get (3.4). 

It should be emphasized that (A.5) is somewhat restrictive. A few 
remarks relating (A.5) to some well-known conditions in the literature on 
intertemporal resource allocation will be instructive. First, going back 
to (3.1), recall that if the technology $\mathcal{T}$ is such that the pair $(\hat{x}, \hat{p})$ of von 
Neumann stocks and prices satisfies the famous condition of unique 
profitability introduced by Radner [4], i.e., if one has 

$$\hat{p}y - \hat{p}x < 0 \text{ for all } (x, y) \in \mathcal{T} \text{ with } x \text{ not proportional to } \hat{x}, \qquad (3.5)$$

the facet $F^*$ reduces to a unique ray through $(\hat{x}, \hat{x})$. Thus, if the Radner condition 
is satisfied, (A.5) requires that the unique von Neumann stock vector $\hat{x}$ be 
strictly positive. The Radner condition and the strict positivity of $\hat{x}$ 
figure prominently in the final state turnpike literature (see [3, pp. 213-219]). 
In general, a technology $\mathcal{T}$ satisfying (A.1) through (A.5) will by no 
means satisfy the Radner condition (see [2, p. 173]). 

Second, (A.5) does not imply the condition of output substitution 
discussed in [1]. Let 

$$\mathcal{T} = \{(x, y) : x \ge Az,\ 0 \le y \le Bz \text{ for } z \ge 0\}, \qquad (3.6)$$

where 



1 o- 


M h 

A m- 

1 1 

.1 1. 

B = 

1 1 
Li \\ 


Note that the von Neumann stock vector 

-H 

A = 1 
-I. 

is unique, and the facet $F^*$ consists only of the ray through $(\hat{x}, \hat{x})$, where 
$\hat{p} = (1, 1, 1)$ is a von Neumann price vector.$^1$ The technology, however, 
does not satisfy the condition of output substitution. This can be verified 
easily by considering 

(*. >') = 


1 Note that $\hat{p}y \le \hat{p}Bz = 3z_1 + 3z_2/2$ and $\hat{p}x \ge \hat{p}Az = 3z_1 + 2z_2$ for all $(z_1, z_2) \ge 0$. 
Thus, $\hat{p}y \le \hat{p}x$ with equality holding for $z_1 > 0$ and $z_2 = 0$. 






and noting that the output of the first commodity cannot be increased 
by reducing the output of the third commodity a little.$^2$ 

A number of conditions for special classes of generalized Leontief and 
von Neumann models can be used to guarantee (A.5). The interested 
reader is referred to the extended discussions in [2, 3]. 

A feasible production program from $x$ is a sequence $(x, y) = (x_t, y_{t+1})$ 
such that 

$$x_0 = x, \quad x_t \le y_t \text{ for all } t \ge 1, \quad (x_t, y_{t+1}) \in \mathcal{T} \text{ for all } t \ge 0. \qquad (3.7)$$

The consumption program $c = (c_t)$ generated by $(x, y)$ is defined as: 

$$c_t = y_t - x_t \ (\ge 0) \text{ for all } t \ge 1. \qquad (3.8)$$

We refer to $(x, y, c)$ as a feasible program from $x$, it being understood that $(x, y)$ 
is a production program and $c$ is the corresponding consumption program. 
A feasible program $(x^*, y^*, c^*)$ from $x$ is efficient if there is no other 
feasible program $(x, y, c)$ from $x$ such that $c_t \ge c_t^*$ for all $t$ and $c_t > c_t^*$ 
for some $t$. A feasible program $(x^*, y^*, c^*)$ is competitive if there is a 
nonzero sequence $(p_t^*)$ of nonnegative price vectors such that for all 
$t \ge 0$ one has 

$$0 = p_{t+1}^* y_{t+1}^* - p_t^* x_t^* \ge p_{t+1}^* y - p_t^* x \text{ for all } (x, y) \text{ in } \mathcal{T}. \qquad (3.9)$$

In other words, the intertemporal profit maximization condition (3.9) is 
satisfied for all $t$. A competitive program $(x^*, y^*, c^*)$ satisfies the transversality 
condition if $p_t^* x_t^*$ goes to zero as $t$ goes to infinity. 


IV. The Necessity of the Transversality Condition 

We are now in a position to state and prove the main result. Under 
(A.1) through (A.5), let $(x^*, y^*, c^*)$ from $x > 0$ be efficient and competitive 
at prices $(p_t^*)$ satisfying (3.9). Then the transversality condition 
is necessarily satisfied. Thus, the asymptotic behavior of $p_t^* x_t^*$ (the value 
of inputs at the competitive prices) is intimately related to the question 
of inefficiency of a competitive program due to capital overaccumulation. 
Recall that if the competitive prices $(p_t^*)$ associated with a given feasible 

2 We cannot use the first activity at all in achieving this substitution, since reduction 
of the output of the third commodity does not generate any surplus input of the first 
commodity, which is essential for using the first activity. On the other hand, using the 
second activity alone, such substitution is impossible due to fixed proportions. 





program are all strictly positive, the transversality condition is sufficient 
for efficiency. On the other hand, failure of the transversality condition 
signifies inefficiency of a given competitive program. 

Theorem 4.1. Under (A.1) through (A.5), let $(x^*, y^*, c^*)$ be an efficient 
program from $x > 0$ and $(p_t^*)$ be a nonzero sequence of competitive prices 
satisfying for all $t \ge 0$ 

$$0 = p_{t+1}^* y_{t+1}^* - p_t^* x_t^* \ge p_{t+1}^* y - p_t^* x \text{ for all } (x, y) \in \mathcal{T}. \qquad (4.1)$$

It follows that 

$$\lim_{t \to \infty} p_t^* x_t^* = 0. \qquad (4.2)$$

For a convenient organization of the proof, let us note three preliminary 
results that provide us with three key steps. 

Proposition 4.1. For any $\epsilon > 0$ there is $\delta > 0$ such that $d[(x, y); F^*] \ge \epsilon$ 
implies 

$$\hat{p}y \le (1 - \delta)\,\hat{p}x. \qquad (4.3)$$

Proof. This, of course, is the famous value-loss lemma of the turnpike 
literature. This version is in [2, Lemma 4]. Q.E.D. 

Proposition 4.2. There is $\alpha > 0$ such that “$(x, y) \in F^*$, $|(x, y)| = 1$” 
implies $|x| \ge \alpha$. 

Proof. If not, there is a sequence $(x^n, y^n) \in F^*$ with $|(x^n, y^n)| = 1$ 
and $\lim_{n \to \infty} |x^n| = 0$. But $(y^n)$ being bounded, one has a subsequence 
$(x^{n'}, y^{n'})$ in $F^*$, $|(x^{n'}, y^{n'})| = 1$, converging to $(x, y)$ in $F^*$, with $x = 0$ 
and $|y| \ne 0$. This contradicts (A.2). Q.E.D. 

Proposition 4.3. There is $e_\alpha = (\epsilon_\alpha, \ldots, \epsilon_\alpha) = \epsilon_\alpha\omega \gg 0$ such that 
“$(x, y) \in F^*$, $|(x, y)| = 1$” implies $x \ge e_\alpha$. 

Proof. Note that the set 

$$C_\alpha = \{x \in R^m_+ : \text{for some } y \ge 0,\ (x, y) \in F^*,\ |x| = \alpha\} \qquad (4.4)$$

is a compact subset of $R^m$ contained in $R^m_{++}$. It is obviously bounded, and 
by (A.5) is contained in $R^m_{++}$. To show that it is closed, take a sequence 
$x^n$ in $C_\alpha$ converging to some $x \ge 0$. Clearly $|x| = \alpha$. By definition of 
$C_\alpha$, there is a corresponding sequence $y^n \ge 0$, such that $(x^n, y^n) \in F^*$. 
Recall that $|x^n| = \alpha$ implies that $(y^n)$ is bounded (see [4, p. 102]). 
Hence there is a convergent subsequence $(x^{n'}, y^{n'})$ tending to some 
$(x', y')$. Since $F^*$ is closed, $(x', y') \in F^*$. Since $x^n$ converges to $x$, any subsequence 
$x^{n'}$ must converge to $x$, so that $x = x'$. Thus, $(x, y') \in F^*$, and 
$|x| = \alpha$ imply that $x \in C_\alpha$, completing the proof of closedness. 

As $C_\alpha$ is a compact subset of $R^m_{++}$, it can be covered by a finite number 
of closed balls each of which is also contained in $R^m_{++}$. Hence it is obvious 
that there is some $\epsilon_\alpha\omega \gg 0$ such that $x \in C_\alpha$ implies $x \ge \epsilon_\alpha\omega$. If $(x, y) \in F^*$ 
and $|(x, y)| = 1$, one has $|x| \ge \alpha$ by Proposition 4.2. Note that if for 
any $(x, y) \in F^*$, $|(x, y)| = 1$ one in fact has $|x| > \alpha$, we can find $\beta > 0$ 
and $\beta < 1$ such that $|\beta x| = \alpha$. Since $F^*$ is a cone, $(\beta x, \beta y) \in F^*$ so that 
$\beta x \in C_\alpha$. Hence $\beta x \ge \epsilon_\alpha\omega$ and $x \ge (1/\beta)\,\epsilon_\alpha\omega \ge \epsilon_\alpha\omega$. Q.E.D. 

Proof of Theorem 4.1. The difficult step in the proof is the assertion 

$$\liminf_{t \to \infty} |x_t^*| = 0. \qquad (4.5)$$

Postponing the proof of (4.5) for the moment, let us see how (4.2) follows 
from (4.5). Since (4.1) is satisfied, one has for any finite $T$ 

$$0 \le p_T^* x_T^* = p_0^* x - \sum_{t=1}^{T} p_t^* c_t^*. \qquad (4.6)$$

Since $S_T = \sum_{t=1}^{T} p_t^* c_t^*$ is monotonically nondecreasing and bounded 
above by $p_0^* x$, $\lim_{T \to \infty} \sum_{t=1}^{T} p_t^* c_t^*$ exists. This in turn implies from (4.6) 
that $\lim_{T \to \infty} p_T^* x_T^*$ exists, and of course, $\lim_{T \to \infty} p_T^* x_T^* \ge 0$. 

Consider any von Neumann stock vector $\hat{x} > 0$; by (A.5), $\hat{x} \gg 0$. Recall that the von 
Neumann growth factor in $\mathcal{T}$ is taken to be equal to 1, so that $(\hat{x}, \hat{x}) \in \mathcal{T}$. 
We use (4.1) to have 

$$p_{t+1}^* \hat{x} \le p_t^* \hat{x}. \qquad (4.7)$$

Hence for all $t \ge 0$, 

$$p_t^* [(\min_i \hat{x}^i)\,\omega] \le p_t^* \hat{x} \le p_0^* \hat{x}. \qquad (4.8)$$

From (4.8) we get $|p_t^*| \le (p_0^* \hat{x} / \min_i \hat{x}^i) = A$, say. Hence, we have 

$$0 \le p_t^* x_t^* \le |p_t^*|\,|x_t^*| \le A\,|x_t^*|. \qquad (4.9)$$

But the right side in (4.9) goes to 0 along a subsequence in view of (4.5). 
Hence, $p_t^* x_t^*$ goes to 0 along a subsequence. As $\lim_{T \to \infty} p_T^* x_T^*$ exists, 
we must have $\lim_{T \to \infty} p_T^* x_T^* = 0$, establishing (4.2). 
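
The accounting identity (4.6) can be checked on a toy case. The following Python sketch is an illustration only, under assumptions that are not taken from the paper: a one-good technology in which output equals input, constant prices equal to one, and stocks x_t = 2^(-t). The function name `value_identity` is hypothetical. Exact rational arithmetic is used so that both sides can be compared without rounding.

```python
from fractions import Fraction


def value_identity(T):
    """Return (p_T x_T, p_0 x - sum_{t<=T} p_t c_t) for a one-good toy program."""
    x = [Fraction(1, 2 ** t) for t in range(T + 1)]       # x_t = 2^{-t}, x_0 = 1
    y = [None] + [x[t - 1] for t in range(1, T + 1)]      # y_t = x_{t-1}
    c = [None] + [y[t] - x[t] for t in range(1, T + 1)]   # c_t = y_t - x_t
    p = [Fraction(1)] * (T + 1)                           # constant prices p_t = 1
    # Zero-profit part of the competitive condition: p_{t+1} y_{t+1} = p_t x_t.
    assert all(p[t + 1] * y[t + 1] == p[t] * x[t] for t in range(T))
    lhs = p[T] * x[T]
    rhs = p[0] * x[0] - sum(p[t] * c[t] for t in range(1, T + 1))
    return lhs, rhs
```

In this toy program the value of inputs falls to zero as T grows, so the transversality condition holds; the sketch only confirms that the value of the terminal stock equals the initial value less the cumulated value of consumption.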

Turning to the demonstration of (4.5), our strategy is to arrive at a 
contradiction by supposing that (4.5) does not hold. If (4.5) does not 
hold, there is some $\alpha' > 0$ and some $T' > 0$ such that 

$$|x_t^*| \ge \alpha' \text{ for all } t \ge T'. \qquad (4.10)$$





There are three main steps leading to a contradiction from (4.10). We 
take them in the following order: 

Step 1. We have to show that (4.10) implies that 

$$\lim_{t \to \infty} d[(x_t^*, y_{t+1}^*); F^*] = 0. \qquad (4.11)$$

Step 2. By using Proposition 4.3 we have to prove that (4.10) and 
(4.11) imply that there is some $e = \bar{\epsilon}\omega \gg 0$ and some $T_0 \ge T'$ such that 

$$x_t^* \ge e \gg 0 \text{ for all } t \ge T_0. \qquad (4.12)$$

Step 3. Using (4.12) and (4.10), construct a feasible program $(\tilde{x}, \tilde{y}, \tilde{c})$ 
from $x > 0$ such that $\tilde{c}_t \ge c_t^*$ for all $t$ and $\tilde{c}_t > c_t^*$ for at least one $t$. 
This means that $(x^*, y^*, c^*)$ is not efficient, a contradiction that establishes 
(4.5). 

To fill in the details of Step 1, suppose that (4.11) is false. This means 
that for some $\epsilon > 0$ 

$$d[(x_t^*, y_{t+1}^*); F^*] \ge \epsilon \text{ for an infinite number of periods.} \qquad (4.13)$$

Among the first $T$ periods, let $N(T)$ be the number of periods in which 
(4.13) holds. One has, from (3.1) (recall that $\hat{\lambda} = 1$), 

$$\sum_{t=1}^{T} \hat{p}c_t^* = \hat{p}x + \sum_{t=1}^{T} [\hat{p}y_t^* - \hat{p}x_{t-1}^*] - \hat{p}x_T^* \qquad (4.14)$$

and $\hat{p}y_t^* \le \hat{p}x_{t-1}^*$ for all $t$, since $(x_{t-1}^*, y_t^*) \in \mathcal{T}$. Now, for each of the 
$N(T)$ periods in which (4.13) is supposed to hold, Proposition 4.1 can be 
applied to get 

$$0 \le \sum_{t=1}^{T} \hat{p}c_t^* \le \hat{p}x - N(T)\,\delta(\hat{p}x_t^*) - \hat{p}x_T^*. \qquad (4.15)$$

As $\hat{p} \gg 0$, and for $t \ge T'$, $|x_t^*| \ge \alpha'$ by (4.10), the value $\hat{p}x_t^*$ is bounded 
below by $(\min_i \hat{p}^i)\,\alpha' > 0$ in every such period. Hence, if $N(T)$ goes to infinity 
with $T$, the right-hand side is negative for a sufficiently large $N(T)$, whereas 
the left side is always nonnegative, a contradiction establishing (4.11). 

Coming to Step 2, observe that if $(x_t^*, y_{t+1}^*) \in F^*$ for some $t \ge T'$, then 
by following the argument used in Proposition 4.3 (with $\alpha$ replaced by $\alpha'$ 
in (4.4)), we get some $\epsilon_{\alpha'} > 0$ such that $x_t^* \ge \epsilon_{\alpha'}\omega$ for any such $t$. To 
establish (4.12), therefore, one can just as well assume that $(x_t^*, y_{t+1}^*)$ 
does not belong to $F^*$ for any $t$. Choose $\epsilon > 0$ such that $\epsilon < \epsilon_\alpha/2$ where 
$\epsilon_\alpha > 0$ is given by Proposition 4.3. Given this $\epsilon > 0$, according to (4.11), 
there is some $T_0 \ge T'$ such that for each $t \ge T_0$, there is $(x_t, y_{t+1}) \in F^*$ 
such that 

$$d[(x_t^*, y_{t+1}^*); (x_t, y_{t+1})] < \epsilon. \qquad (4.16)$$

Since $F^*$ is a cone, we can take $|(x_t, y_{t+1})| = 1$ without loss of generality 
(recalling the definition (2.1) of the angular distance used above). Now, 
$(x_t, y_{t+1}) \in F^*$ and $|(x_t, y_{t+1})| = 1$ implies by Proposition 4.3, 

$$x_t \ge \epsilon_\alpha\omega \gg 0 \text{ for all } t \ge T_0. \qquad (4.17)$$

But according to (4.16) and the fact that $|(x_t^*, y_{t+1}^*)| \ge |x_t^*| \ge \alpha'$, one has 
for $t \ge T_0$ and for all $i$, 

$$\left| \frac{x_t^{*i}}{|(x_t^*, y_{t+1}^*)|} - x_t^i \right| < \epsilon,$$

$$x_t^{*i} > |(x_t^*, y_{t+1}^*)|\,(x_t^i - \epsilon), \qquad (4.18)$$

$$x_t^{*i} > \alpha'(\epsilon_\alpha - \epsilon) = \bar{\epsilon} > 0.$$

This completes the proof of (4.12). 

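For constructing a program $(\tilde{x}, \tilde{y}, \tilde{c})$ that would contradict the efficiency 
of $(x^*, y^*, c^*)$, note that $\hat{x} \le (\max_i \hat{x}^i)\,\omega$ implies that $\omega \ge \hat{x}/\max_i \hat{x}^i$. 
Hence for all $t \ge T_0$, 

$$x_t^* \ge \bar{\epsilon}\omega \ge \bar{\epsilon}\,[\hat{x}/\max_i \hat{x}^i] = m\hat{x}, \qquad (4.19)$$

where $m = \bar{\epsilon}/\max_i \hat{x}^i$. Since $\hat{p} \gg 0$, and from (4.14), $\sum_{t=1}^{T} \hat{p}c_t^* \le \hat{p}x$, 
we have $\sum_{t=1}^{\infty} |c_t^*| \le M$ where $M$ is a (finite) positive number. It follows 
that there is $t' \ge T_0$ such that 

$$\sum_{t=t'+1}^{\infty} |c_t^*| / \min_i \hat{x}^i < m/2. \qquad (4.20)$$

Setting $\theta_t^* = |c_t^*| / \min_i \hat{x}^i$, we can rewrite (4.20) as 

$$\sum_{t=t'+1}^{\infty} \theta_t^* < m/2. \qquad (4.21)$$

We now construct a program $(\tilde{x}, \tilde{y}, \tilde{c})$ from $x > 0$ as follows: 

(a) $\tilde{x}_t = x_t^*$, $\tilde{y}_t = y_t^*$, $\tilde{c}_t = c_t^*$ for $t = 1, \ldots, t' - 1$, 

(b) $\tilde{y}_{t'} = y_{t'}^*$, $\tilde{x}_{t'} = (m/2)\,\hat{x}$, $\tilde{c}_{t'} = \tilde{y}_{t'} - \tilde{x}_{t'}$, 

(c) $\tilde{y}_t = \tilde{x}_{t-1}$, $\tilde{x}_t = [m/2 - \sum_{s=t'+1}^{t} \theta_s^*]\,\hat{x}$, $\tilde{c}_t = \tilde{y}_t - \tilde{x}_t$ for $t > t'$. 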





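In order to check that the program is feasible we proceed as follows: 

(1) Nonnegativity: Obvious for $t \le t' - 1$. 
For $t = t'$, note that $\tilde{c}_{t'} = \tilde{y}_{t'} - \tilde{x}_{t'} = y_{t'}^* - (m/2)\,\hat{x} = (c_{t'}^* + x_{t'}^*) - (m/2)\,\hat{x} \ge (c_{t'}^* + m\hat{x}) - (m/2)\,\hat{x} > c_{t'}^* \ge 0$. 
Also, it is clear that $\tilde{x}_{t'} \ge 0$ and $\tilde{y}_{t'} \ge 0$. For $t > t'$, $\tilde{x}_t \ge 0$, since 
$\sum_{s=t'+1}^{t} \theta_s^* < m/2$ (by (4.21)) and $\tilde{c}_t = \tilde{x}_{t-1} - \tilde{x}_t \ge 0$. 

(2) $(\tilde{x}_t, \tilde{y}_{t+1}) \in \mathcal{T}$ for all $t \ge 0$, since this is obvious for $t \le t' - 1$, 
and for $t \ge t'$, $(\tilde{x}_t, \tilde{y}_{t+1}) \in \mathcal{T}$ as $\tilde{x}_t$ is proportional to $\hat{x}$, and $\mathcal{T}$ is a cone 
containing $(\hat{x}, \hat{x})$. 

(3) Obviously, $\tilde{y}_t = \tilde{x}_t + \tilde{c}_t$. Thus we see that $(\tilde{x}, \tilde{y}, \tilde{c})$ is feasible 
from $x$. Finally, $\tilde{c}_t = c_t^*$ for $t = 1, \ldots, t' - 1$; $\tilde{c}_{t'} > c_{t'}^*$ (as verified in (1) 
above); for $t > t'$, $\tilde{c}_t = \tilde{y}_t - \tilde{x}_t = \tilde{x}_{t-1} - \tilde{x}_t = \theta_t^*\hat{x} = [|c_t^*| / \min_i \hat{x}^i]\,\hat{x} \ge 
[|c_t^*| / \min_i \hat{x}^i]\,[\min_i \hat{x}^i\,\omega] = |c_t^*|\,\omega \ge c_t^*$. 

Hence $(x^*, y^*, c^*)$ is inefficient, completing the proof of Theorem 4.1. 

Q.E.D. 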
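
The construction in Step 3 can be sketched in a few lines of Python. This is a hedged illustration, not the authors' procedure: it handles only a finite truncation of the tail of the consumption sequence, it assumes the norm is the sum of absolute coordinates, and the function name `dominating_tail` and its arguments are hypothetical. Exact fractions are used so that the dominance comparison is not disturbed by rounding.

```python
from fractions import Fraction


def dominating_tail(x_hat, m, c_star_tail):
    """Build the stocks and consumptions of the dominating program after t'.

    x_hat       : von Neumann stock vector with strictly positive entries
    m           : the positive scalar for which x*_t >= m * x_hat holds
    c_star_tail : finite list of consumption vectors c*_{t'+1}, c*_{t'+2}, ...
    Returns (x_tilde, c_tilde); x_tilde[0] is the stock at t'.
    """
    lo = min(x_hat)
    # theta*_t = |c*_t| / min_i x_hat^i, with |.| the sum norm
    theta = [sum(abs(v) for v in c) / lo for c in c_star_tail]
    if sum(theta) >= m / 2:
        raise ValueError("tail of c* is too large for the chosen m")
    scale = m / 2
    x_tilde = [[scale * v for v in x_hat]]       # stock at t' is (m/2) x_hat
    c_tilde = []
    for th in theta:
        scale -= th                              # m/2 minus cumulated theta*
        x_next = [scale * v for v in x_hat]
        # consumption = previous stock carried over minus new stock
        c_tilde.append([a - b for a, b in zip(x_tilde[-1], x_next)])
        x_tilde.append(x_next)
    return x_tilde, c_tilde
```

Each constructed consumption vector equals theta*_t times the von Neumann stock vector, which is why it dominates the original consumption coordinate by coordinate while the remaining stock stays nonnegative.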

Remark. It is instructive to look at an example of an efficient program 
that satisfies the transversality condition at one system of competitive 
prices, while violating it at another. Note that in this example, (A.5) 
is not satisfied. $\mathcal{T} = \{(x, y) : Bz \ge y,\ Az \le x \text{ for some } z \ge 0\}$, where 



Here £ — [,*] ( — i) is a von Neumann stock vector, and f> — [0, 1] a 
von Neumann price vector. Let x = [J] be the initial stock vector. Define 
Xt* = [?], for all / > 1, >'j* = [j J ] and y* +i = [J] for all / 3s 1, c,* = [ 0 S ] 
and c* ¥l — [$] for all t > 1. This program is efficient. Two competitive 
price systems are (a) p t * — f> for all t Js 0, and (b) q 0 * =■ (£, i), 
$q_1^* = (1, 0)$, $q_t^* = (0, 0)$ for all $t \ge 2$. Clearly the transversality condition 
does not hold for the price system (a), while it does for the price 
system (b). Since much has been said about characterizing efficiency in terms 
of present value maximization, note that the efficient program of this 
example actually minimizes the value of consumption over the set of all 
feasible consumption sequences at the price system (a). 

References 

1. M. Majumdar, T. Mitra, and D. McFadden, Efficiency and Pareto optimality 
of competitive programs in closed multisector models, J. Econ. Theory 13 (1976), 
26-46. 

2. L. McKenzie, Turnpike theorems for a generalized Leontief model, Econometrica 
31 (1963), 165-180. 

3. H. Nikaido, “Convex Structures and Economic Theory,” Academic Press, New 
York, 1968. 

4. R. Radner, Paths of economic growth that are optimal only with regard to final 
states: A turnpike theorem, Rev. Econ. Stud. 28 (1961), 98-104. 



JOURNAL OF ECONOMIC THEORY 13, 58-66 (1976) 


Equivalence of Consumer Surplus, the Divisia Index of Output, 
and Eisenberg’s Addilog Social Utility* 

Trout Rader 

Department of Economics, Washington University, St. Louis, Missouri 
Received November 8, 1974; revised September 26, 1975 


1. Introduction 

The purpose of this paper is to show what sense can be made of Dupuit- 
Marshall type consumer surplus measures in the empirical literature on 
international trade, industrial organization, and public finance (e.g., 
Harberger [7-9] and Basevi [1], among many, many others). The author 
has already done this in a fashion in [13], part of which is redone in [14, 
Chap. 8, Sect. 7]. Related ideas have appeared in Eisenberg [6], Chipman 
and Moore [3, 4], and Chipman [2]. However, all these theoretical works 
lack the full generality and depth that is possible when we apply the method 
of analysis in Mantel [12]. 

Mantel’s approach is to represent homothetic preferences by an addilog 
utility a la Samuelson [17] and show that demand equations have an 
especially simple form. This will be adapted to consumer surplus type 
arguments. 

Relatively little of this paper is aimed at completely new results. Instead 
we bring together and simplify many more or less well-known facts. 
Some new conclusions are drawn that are just short steps beyond the 
analysis already in print. Our main result is that a form of consumer and 
producer surplus defines an addilog social utility function whose rate of 
change is equal to the Divisia index of demand, provided that each consumer 
has homothetic preferences. 

Let $d = d(p, \theta)$ be demand, $p$ be the price vector, $\theta$ income, and $R$ 
denote “at least as good as.” Our aim is to study the compensated income 
utility function 

$$V_{\bar{p}}(x) = \min\{\theta \mid d(\bar{p}, \theta)\,R\,x\}.$$

* The author is indebted to conversations with Rolf Mantel of Instituto Torcuato 
di Tella (Argentina) and James Little of Washington University and to correspondence 
with Donald Katzner of the University of California, La Jolla. 

Copyright © 1976 by Academic Press, Inc. 
All rights of reproduction in any form reserved. 





Samuelson [18] calls $V_{\bar{p}}(x)$ the money metric with reference prices $\bar{p}$, a 
term that we will use. Provided preferences are homothetic, $V_{\bar{p}}$ is a linear 
homogeneous function. Also the Bernoulli utility function $\ln V_{\bar{p}}$ is addilog 
in the sense that 

$$\ln V_{\bar{p}}(kx) = \ln k + \ln V_{\bar{p}}(x).$$

This addilog form was the one used by Samuelson [17], Katzner [10], 
and Mantel [12]. We show that the addilog money metric is representable 
by consumer surplus and, consequently, that the money metric is an exponential 
form of consumer surplus. 

The differential of a variable $x$ is denoted $\dot{x}$. 

Let $f$ be a vector valued function of $x$. Then by 

$$\int_{x(0)}^{x(1)} f(x)\,dx$$

we mean the line integral along an arc $x(\epsilon)$, $0 \le \epsilon \le 1$, with end points 
$x(0)$ and $x(1)$. Always we refer to the initial value of the independent 
variable as $x(0)$ and the final value as $x(1)$. In application, the integral 
will depend only upon the endpoints and not upon the particular arc 
chosen so that the integral will be invariant of path. Thus the notation will 
be unambiguous. Furthermore, we can write 

$$\int_{x(0)}^{x(1)} f(x)\,dx = \sum_i \int_{x_i(0)}^{x_i(1)} f_i(x)\,dx_i$$

without specifying whether the $x_i$'s change in order or all together. 


2. Consumer Surplus 

Here we characterize consumer surplus. Much of the result has already 
been announced in Rader [13, 14, pp. 234-245] but the proofs there are 
very long and difficult. 

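Theorem 1. Suppose preferences are homothetic and for the 
money metric, $V_{\bar{p}}$, we have that $\partial^2 V_{\bar{p}}/\partial x^2$ is of rank $n - 1$ so that 
$\det \partial^2 \ln V_{\bar{p}}/\partial x^2 \ne 0$ and $\partial x/\partial p$ and $\partial x/\partial\theta$ exist (or just assume $\partial x/\partial p$ and 
$\partial x/\partial\theta$ exist). Then 

$$(\ln V_{\bar{p}})^{\cdot} = -\frac{x\dot{p}}{\theta} + \frac{\dot{\theta}}{\theta}$$

or 

$$\ln V_{\bar{p}}(x(1)) - \ln V_{\bar{p}}(x(0)) = -\int_{p(0)}^{p(1)} \frac{x}{\theta}\,dp + \int_{\theta(0)}^{\theta(1)} \frac{d\theta}{\theta}$$

and 

$$V_{\bar{p}}(x(1)) = V_{\bar{p}}(x(0)) \exp\left(-\int_{p(0)}^{p(1)} \frac{x}{\theta}\,dp + \int_{\theta(0)}^{\theta(1)} \frac{d\theta}{\theta}\right) = V_{\bar{p}}(x(0))\,\frac{\theta(1)}{\theta(0)} \exp\left(-\int_{p(0)}^{p(1)} \frac{x}{\theta}\,dp\right).$$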


Note that the addilog utility is just the Divisia moving base index of goods 
demanded, $D$, solving the differential equation $\dot{D}/D = p\dot{x}/px$. 

In Theorem 1, the money metric is measured by the exponential form of 
consumer surplus (introduced in [15; 16, pp. 234-245]). The logarithm of 
the money metric is close to ordinary consumer surplus. Indeed, changes 
in the addilog utility equal the changes in consumer surplus whenever 
nominal income, $\theta$, is constant. (This is already an easy consequence of 
the constancy of a multiplier $\lambda = 1/\theta$ [17] and of observations by Katzner 
about consumer surplus and indirect utility [11, pp. 152-154]. Not only 
a related fact but a converse to that fact was recently stated by Silberberg 
[19]: For given nominal income, consumer surplus is path invariant if 
and only if preferences are homothetic.) 

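Proof. Set $V = V_{\bar{p}}$. 

The usual demand conditions are 

$$\frac{\partial \ln V}{\partial x} = \lambda p,$$

where $\lambda$ is evaluated by 

$$\lambda\theta = \lambda xp = x\,\frac{\partial \ln V}{\partial x} = x\,\frac{1}{V}\frac{\partial V}{\partial x} = 1 \quad \text{(by Euler's theorem)},$$

or $\lambda = 1/\theta$. Thus at all points $x = d(p, \theta)$ we have $\partial \ln V/\partial x = p/\theta$ and 
consequently 

$$\frac{\partial^2 \ln V}{\partial x^2}\frac{\partial x}{\partial p} = \frac{1}{\theta}\,I, \qquad \frac{\partial^2 \ln V}{\partial x^2}\frac{\partial x}{\partial \theta} = -\frac{p}{\theta^2}.$$

To show $\partial x/\partial p$ and $\partial x/\partial\theta$ exist, we need only show $\det \partial^2 \ln V/\partial x^2 \ne 0$, 
which follows according to work of Dhrymes [5] and Rader [16, 
Remark 3]. 

Also, we can show $\partial \ln V/\partial x = -x(\partial^2 \ln V/\partial x^2)$. This follows from 
taking the derivative of an equation of the previous paragraph, namely 

$$x(\partial \ln V/\partial x) = 1.$$

Now we can evaluate 

$$(\ln V)^{\cdot} = \frac{\partial \ln V}{\partial p}\,\dot{p} + \frac{\partial \ln V}{\partial \theta}\,\dot{\theta} = \frac{\partial \ln V}{\partial x}\frac{\partial x}{\partial p}\,\dot{p} + \frac{\partial \ln V}{\partial x}\frac{\partial x}{\partial \theta}\,\dot{\theta}$$

$$= -x\,\frac{\partial^2 \ln V}{\partial x^2}\left[\frac{\partial x}{\partial p}\,\dot{p} + \frac{\partial x}{\partial \theta}\,\dot{\theta}\right] = -x\left[\frac{\dot{p}}{\theta} - \frac{p\dot{\theta}}{\theta^2}\right] = -\frac{x\dot{p}}{\theta} + \frac{\dot{\theta}}{\theta}.$$

Q.E.D. 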
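
The path-integral formula of Theorem 1 can be checked numerically in a special case. The Python sketch below is an illustration under assumptions that are not in the paper: Cobb-Douglas preferences with exponents 0.3 and 0.7, a straight-line path in prices and income, and a midpoint rule for the integral. All function names are hypothetical. For these preferences the log of the money metric at fixed reference prices is the log of minimum expenditure needed to reach the utility of x.

```python
import math

A = (0.3, 0.7)  # assumed Cobb-Douglas exponents, summing to one


def demand(p, theta):
    """Cobb-Douglas demand: x_i = a_i * theta / p_i."""
    return tuple(a * theta / pi for a, pi in zip(A, p))


def log_money_metric(x, p_ref):
    """ln of minimum expenditure at reference prices p_ref to reach u(x)."""
    return sum(a * (math.log(xi) + math.log(pi / a))
               for a, xi, pi in zip(A, x, p_ref))


def surplus_integral(p0, th0, p1, th1, steps=20000):
    """Midpoint-rule value of -int (x/theta) dp + int dtheta/theta on a line."""
    dp = tuple((b - a) / steps for a, b in zip(p0, p1))
    dth = (th1 - th0) / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) / steps
        p = tuple(a + s * (b - a) for a, b in zip(p0, p1))
        th = th0 + s * (th1 - th0)
        x = demand(p, th)
        total += -sum(xi * dpi for xi, dpi in zip(x, dp)) / th + dth / th
    return total
```

Comparing `log_money_metric(demand(p1, th1), p_ref) - log_money_metric(demand(p0, th0), p_ref)` with `surplus_integral(p0, th0, p1, th1)` gives agreement up to discretization error, whatever reference prices are chosen, since the reference prices only shift the log money metric by a constant.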


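Next we aggregate over several consumers. The money metric or 
Eisenberg social utility is defined by 

$$\ln V_{\bar{p}} = \sum_k \frac{\theta_k}{\theta} \ln V_{\bar{p}}^k(x^k) \quad \text{or} \quad V_{\bar{p}} = \prod_k (V_{\bar{p}}^k)^{\theta_k/\theta},$$

where $\theta = \sum_k \theta_k$ is national income. 

If the weights $\theta_k/\theta$ are fixed, then $V_{\bar{p}}$ is a special case of Eisenberg’s 
[6] social utility function that gives a complete social ordering as already 
shown by Eisenberg [6] and Chipman [2]. In Rader [13, 14] fixed $\theta_k/\theta$ is 
called “distributing the burden uniformly.” 

Theorem 2. Assume the hypothesis of Theorem 1, and assume $\theta_k/\theta$ 
is fixed for all $k$. Then 

$$(\ln V_{\bar{p}}^k)^{\cdot} = (\ln V_{\bar{p}})^{\cdot} + \left(\frac{x}{\theta} - \frac{x^k}{\theta_k}\right)\dot{p} \qquad (1)$$

and 

$$(\ln V_{\bar{p}})^{\cdot} = \frac{1}{\theta}\,[\dot{\theta} - x\dot{p}] = \frac{p\dot{x}}{\theta} \qquad (2)$$

or 

$$V_{\bar{p}}(1) = V_{\bar{p}}(0) \exp\left(\int_{x(0)}^{x(1)} \frac{p}{\theta}\,dx\right).$$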


or 





In words, Theorem 2 shows that the Divisia index of demand is 
computed by taking consumer surplus. In turn, it serves as a social 
utility function so long as income distribution is unchanged. 


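Proof. 

$$(\ln V_{\bar{p}})^{\cdot} = \sum_k \frac{\theta_k}{\theta}\,(\ln V_{\bar{p}}^k)^{\cdot} = \sum_k \frac{\theta_k}{\theta}\left[\frac{\dot{\theta}_k}{\theta_k} - \frac{x^k\dot{p}}{\theta_k}\right] = \frac{\dot{\theta}}{\theta} - \frac{x\dot{p}}{\theta}.$$

Q.E.D. 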


We can now see what the exponential consumer surplus does. Assuming 
no redistribution of income, it measures a geometric sum of money metrics 
among consumers. Given fixed income distribution, exponential consumer 
surplus is a utility representing a complete social ordering. It is not the 
Pareto ordering because “welfares” of any two individuals may move in 
opposite directions. 

Thus, we have removed the often cited limitation to a single market or 
consumer. Consumer surplus applies equally well to the general economy. 
However, it may be that sometimes some individual’s welfare moves 
opposite to social welfare as measured by exponential consumer surplus. 
Equation (2) shows the rate of change of economy wide welfare as 
measured by consumer surplus. From it follows the fact that welfare is 
a Divisia index of goods demanded by all consumers. Furthermore, 
Eq. (1) shows the incidence of a change for individual $k$ relative to the 
whole economy. The difference between individual and economy wide 
welfare is entirely a matter of a comparison of the change in cost to the 
individual and the economy, respectively, of demand per unit of income. 


3. Producer Surplus 

Now we show how to compute the loss in income, $\dot{\theta}/\theta$. The usual reason 
for loss of welfare is that a wedge is placed between consumer prices $p$ 
and producer prices $q$ so that they are no longer equal or even proportional 
one to the other. Otherwise, we are on the frontier of the output possibility, 
say because factor markets are competitive [14, pp. 110-122]. 

More explicitly, the consumer $i$ demands $x^i$ subject to $px^i = qy^i$ where 
he considers $qy^i$ fixed. The producer chooses $y$ so that $y$ is on the smooth 
frontier of the output possibility set and $y$ has normal $q$: $\dot{y}q = 0$. Then 
the producer pays $qy^i$ to consumer $i$ so that $\sum_i y^i = y$ and the full value 
of output is distributed. At present, we do not necessarily require supply 
to equal demand ($y \ne x$). 

Because $\dot{y}q = 0$, we evaluate the relative rate of change of national 
income in consumer prices as 

$$\frac{\dot{\theta}}{\theta} = \frac{(py)^{\cdot}}{\theta} = \frac{\dot{p}y + p\dot{y}}{\theta} = \frac{\dot{p}y}{\theta} + \frac{(p - q)\dot{y}}{\theta}.$$

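Theorem 3. Under the hypothesis of Theorem 2 and assuming producer 
prices $q$ are normal to the production possibility frontier at $y$, we have 

$$(\ln V_{\bar{p}})^{\cdot} = \frac{p - q}{\theta}\,\dot{y} + \frac{y - x}{\theta}\,\dot{p}$$

or 

$$\ln V_{\bar{p}}(1) = \int_{y(0)}^{y(1)} \frac{p - q}{\theta}\,dy + \int_{p(0)}^{p(1)} \frac{y - x}{\theta}\,dp + \ln V_{\bar{p}}(0)$$

or 

$$V_{\bar{p}}(1) = V_{\bar{p}}(0) \exp\left(\int_{y(0)}^{y(1)} \frac{p - q}{\theta}\,dy + \int_{p(0)}^{p(1)} \frac{y - x}{\theta}\,dp\right).$$

The formula in Theorem 3 is a generalized consumer surplus-producer 
surplus formula as shown in Fig. 1 in the closed model where $y = x$. 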










Corollary 1. If $x = y$ as in a closed economy, then 

dn ?,) = y (-^) yf° r an y ' > °)‘ 

Thus only the markets with distorted prices need be considered. The 
change in $\ln V_{\bar{p}}$ is obtained by integration. It equals the area between the 
locus of prices consumers pay and the locus of prices producers receive, 
between old and new outputs. 

An example of application of Corollary 1 is where there is a sales tax 
in a closed economy. 

Corollary 2. If $p = q$, $x \ne y$, as in an open economy without domestic 
distortion, then 

$$(\ln V_{\bar{p}})^{\cdot} = \frac{y - x}{\theta}\,\dot{p},$$

so only markets where prices change need be considered. 

The change in In V f equals the area between the locus of home output 
and the locus of home consumption, between old and new home prices. 
An example of application of Corollary 2 is where there is a change in 
tariffs on imports but no change in world prices. Then $\dot{p}$ is just the change 
in specific tariffs and the computation of change in social welfare need only 
be made for the goods whose tariffs change. 


4. Conclusions 

In order to use an empirically derived consumer-producer surplus,
we must see that it is weighted in summation by the reciprocal of national
income ($\theta$). Usually the percentage change in $\theta$ is so small that this poses
no problem. Also, we must check to see that income distribution does not
change. These are the prerequisites for equating consumer-producer
surplus with the addilog social utility function in the presence of homothetic
preferences.

A major problem is income distribution. The whole analysis here 
assumes that the distribution of income consumed is fixed. However, in 
many cases the functional distribution of income is known to be greatly 
changed by tariffs, taxes, monopoly, and the like. For instance, for two 
goods, the Stolper-Samuelson theorem shows that changes in relative
prices change relative wages by a magnified amount. Thus the assumption 





here must be that the income people end up with is not much dependent 
upon the functional distribution. Only with this assumption does consumer 
surplus become a sensible approach to the analysis of changes in economic 
conditions. 

There are good efficiency reasons for homothetic consumers with 
identical preferences to have the same income distribution in the end. The 
reason is that they have identical views toward time discount, prices, 
and risk and differ only in their wealth. As their wealth changes, given 
prices, their consumption changes in exact proportion. Thus the consumption
in every state of the world of a rich consumer should be just a
proportion of that of a poor one of the same type. Otherwise, there is 
not choice according to prices and necessarily the two could trade to a 
mutually beneficial position. 

It is only when there are different types of homothetic consumers that 
income distribution may change over time. However, this is due to differing 
attitudes toward prices, risk, and time. It seems unlikely that there would 
be any consistent trend to or from the poor except that those with high 
discount rates will gradually impoverish themselves, regardless of the 
changes in taxes, tariffs, or monopoly power [15, Chap. 6]. Still, in the
short run, changes in income distribution with changes in economic 
institutions will depend fortuitously upon the change in relative prices. 
If consumers are very similar in tastes as is postulated in so many empirical 
studies of demand, then there will be virtually no change in income
distribution.

What are some of the instruments by which factor owners can conserve
their proportion of income? Labor rich factor owners are forbidden
by law to sell their labor stock to capital rich factor owners. However, 
they can agree to give special consideration to capital owners whenever 
labor is especially well favored. In return the capital rich factor owners 
can agree to give special consideration to labor owners when capital 
receives a bonanza. Presumably these arrangements could be made in 
the political arena by various welfare programs and special tax benefits. 
Among capital rich factor owners, there will be some tendency to own 
similar proportions of the vector of social nonlabor factors in order to 
hedge against unusual changes in the functional distribution of nonlabor 
factors. Evidently there are many opportunities for cushioning the effect 
of change in the functional distribution of income upon final demand. 

Even given that consumers are homothetic, one last problem remains 
with using empirical consumer surplus measures. It is one easily 
disposed of. 

Empirical studies usually quote a dollar figure for benefits or costs. 
These relate to the money metric, $V_f$, rather than the addilog function
$\ln V_f$. However, changes in $\ln V_f$ are often close estimates of proportional
changes in $V_f = e^{\ln V_f}$. For instance,

$$|\Delta \ln V - \Delta V / V| < 5\% \quad \text{for} \quad |\Delta \ln V| < 9\%.$$

Most studies come up with modest changes in consumer surplus so that
there is no pressing need to go to the exponential form in order to make
statements about dollar costs and benefits.
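The size of the approximation error can be checked directly. The sketch below is illustrative only: it assumes the 5% bound is read as the gap between $\Delta \ln V$ and $\Delta V / V$ measured relative to $|\Delta \ln V|$, and the function name is hypothetical.

```python
import math

def log_vs_relative_change(delta_log):
    """Return (Delta V / V implied by a log change, absolute gap between the two)."""
    relative_change = math.exp(delta_log) - 1.0   # Delta V / V when ln V moves by delta_log
    gap = abs(delta_log - relative_change)
    return relative_change, gap

# Assumed reading: compare the gap with the size of the log change itself.
for d in (-0.09, -0.05, 0.05, 0.09):
    rel, gap = log_vs_relative_change(d)
    print(f"d ln V = {d:+.2f}  dV/V = {rel:+.5f}  gap / |d ln V| = {gap / abs(d):.4f}")
```

Under this reading the ratio stays below 0.05 at both ends of the 9% range, and it shrinks as the log change gets smaller.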

References 

1. G. Basevi, The restrictive effect of the U. S. tariff and its welfare value, Amer. Econ.
Rev. 58 (1968), 840-852.

2. J. Chipman, Homothetic preferences and aggregation, J. Econ. Theory 8 (1974),
26-38.

3. J. Chipman and J. Moore, The compensation principle in welfare economics, in
“Papers in Quantitative Economics,” Vol. II (A. M. Zarley, Ed.), pp. 1-77. Univ.
of Kansas Press, Lawrence, Kansas, 1969.

4. J. Chipman and J. Moore, Social utility and the gain from trade, J. Internat.
Econ. 2 (1972), 157-172.

5. P. Dhrymes, On a class of utility and production functions yielding everywhere
differentiable demand functions, Rev. Econ. Studies 34 (1967), 399-408.

6. E. Eisenberg, Aggregation of utility functions, Management Science 7 (1961),
337-350.

7. A. C. Harberger, Monopoly and resource allocation, Amer. Econ. Rev. (1954),
77-87.

8. A. C. Harberger, The incidence of the corporation income tax, J. Political Econ. 70
(1962), 215-240.

9. A. C. Harberger, Taxation, resource allocation, and welfare, in “The Role of
Direct and Indirect Taxes in the Federal Revenue System,” Princeton Univ. Press,
for the National Bureau of Economic Research, 1964.

10. D. W. Katzner, A note on the constancy of the marginal utility of income, Internat.
Econ. Rev. (1967), 128-130.

11. D. W. Katzner, “Static Demand Theory,” Macmillan, New York, 1970.

12. R. Mantel, The welfare adjustment process: its stability properties, Internat.
Econ. Rev. 12 (1971), 415-430.

13. T. Rader, International trade and development in a small country II, in “Papers
in Quantitative Economics,” Vol. II (A. M. Zarley, Ed.), Univ. of Kansas Press,
Lawrence, Kansas, 1971.

14. T. Rader, “Theory of Microeconomics,” Academic Press, New York, 1972.

15. T. Rader, “Theory of General Economic Equilibrium,” Academic Press, New York,
1972.

16. T. Rader, Smooth maximizers, mimeographed.

17. P. Samuelson, Constancy of the marginal utility of income, in “Studies in Mathematical
Economics and Econometrics” (O. Lange, F. McIntyre, T. O. Yntema,
Eds.), pp. 75-91. Univ. of Chicago Press, 1942.

18. P. Samuelson, Complementarity, J. Econ. Literature 12 (1974), 1255-1289.

19. E. Silberberg, Duality and the many consumer surpluses, Amer. Econ. Rev. 62
(1972), 942-952.



JOURNAL OF ECONOMIC THEORY 13, 67-81 (1976)


The Precautionary Demand for Money: 

A Utility Maximization Approach 

Carol A. Taylor* 

Division of Economic and Business Research, University of Arizona, 
Tucson, Arizona 85721

Received January 24, 1975; revised March 1, 1976 


Precautionary Demand for Money 

Uncertainty with respect to the level of expenditures required to meet 
necessities may affect an individual’s optimal money holdings. Assume 
the utility of expenditure function is a generalized Stone-Geary function, 
$U(E - N)$, in which $E$ is total expenditure and $N$ is expenditure necessary
to achieve “minimum requirements.” These minimum requirements 
are fixed in real terms, e.g., “one operating car” or “no unset broken 
bones,” but the expenditure needed to achieve this real level is a random 
variable resulting from random fluctuations in the working condition 
of necessary durable goods, random fluctuations in health, etc. Uncertainty 
about the level of N, which makes total optimal expenditure also uncertain, 
affects the portfolio decisions of an individual and in particular affects the 
optimal level of money holdings. The difference between the amount of 
money an individual holds when N is a random variable and the amount
he holds if the expected value of N is known to occur with certainty is 
defined as the individual’s “precautionary money balance.” This paper 
examines the sign of the precautionary money balance, how it is affected 
by increases in income, and how it is affected by increases in risk. 

Formally the model is as follows: In period 1, the individual has initial 
income Y which he plans to spend in two future periods, periods 2 and 3. 
He may either hold money, M, or put his income into interest-bearing 
risk-free bonds. His decision is constrained by the following interperiod 
trading conditions. If he puts one dollar into bonds in period 1, he will receive
$r$ dollars in period 3, where $r > 1$. It is assumed that the time interval between

*The author wishes to thank H. S. Shapiro, A. Deardorff, D. Heckerman, R. 
Holbrook, and S. G. Winter for their very helpful comments and suggestions on 
various versions of this paper. An earlier draft was presented at the December 1974 
Econometric Society Meetings. 


Copyright © 1976 by Academic Press, Inc. 

All rights of reproduction in any form reserved.





periods 2 and 3 is sufficiently short that the interest on a bond of shorter 
maturity between these two periods does not cover transactions and/or 
nuisance costs of conversion. Thus, the individual will never purchase 
bonds in period 2 for use in period 3. If in period 2 the individual finds 
he has more money than he wishes to spend, he may of course hold the 
extra money for use in period 3. Furthermore, if the individual in period 2 
desires more money than he has on hand, he may prematurely convert 
to cash bonds purchased earlier. It is assumed that such premature conversion
involves a cost. In particular, if sufficient bonds are sold to obtain
one dollar for use in period 2, then the foregone expenditure in period 3 is not just
$r$ dollars, but $\gamma r$ dollars, where $\gamma > 1$.$^1$

In period 1 in the stochastic case, the value that the random variable
$N$ will take on in period 2 is unknown. However, the probability density
function of $N$, $f(N)$, is known. When period 2 arrives, the value of $N$ for
that period becomes known. The individual may then readjust his expenditure
between periods 2 and 3 within the limits imposed by his initial choice
of bonds and money and the intertemporal trading conditions. The
expected value of $N$, $\bar N$, is assumed to occur with certainty in period 3.$^2$
If we assume that the individual’s intertemporal utility function is of the
additive form, his time discount function is of the form $\tau^t$, $\tau < 1$,$^3$ and
he maximizes utility according to the “principle of optimality” [2], then in
the stochastic case the individual will pick $M \le Y$ to maximize

$$H = \int_{N_{\min}}^{N_{\max}} \left[ U(C(M,N) + A(M,N) - N) + \tau U(r(Y - M) + M - C(M,N) - r\gamma A(M,N) - \bar N) \right] f(N)\,dN \tag{1}$$

where 

$C$ = period 2 expenditure out of money held, i.e., $C \le M$,
and $A$ = period 2 expenditure out of premature asset conversion.


1 This model is formally very similar to one used by Goldman [5]. The relation of 
this work to Goldman’s is discussed in the final section of the paper. 

2 This assumption is not critical and is made only to simplify the exposition. All
theorems presented in the text are valid if necessities expenditure in period 3 is a bounded,
continuous random variable distributed independently of necessities expenditure
in period 2. The two random variables need not be identically distributed.

3 Since the primary concern of this study is behavior under uncertainty, the discount
function is chosen to be of this form to avoid confusing adjustments made in planned
consumption arising from acquired knowledge of the value of the uncertain necessities
variable with adjustments made in planned consumption because prior planning
turned out to be myopic (see Strotz [10]).





In the deterministic case, $\bar N$ is known to occur with certainty in period 2
as well as period 3. Thus the individual picks $M \le Y$ to maximize

$$D = U(M - \bar N) + \tau U(r(Y - M) - \bar N). \tag{2}$$

If $M_R$ denotes the value of $M$ which maximizes (1) and $M_D$ the value of $M$
which maximizes (2), then the precautionary money balance is $M_R - M_D$.

The following basic assumptions are made about the utility function
$U(E - N)$, the random variable $N$, and the magnitude of income relative
to necessities. If $x = E - N$, $U_1(x) = \partial U(x)/\partial x$, $U_{11}(x) = \partial^2 U(x)/\partial x^2$,
etc., and $D(U)$ = domain of the function $U$, then

A. $\forall x \in D(U)$, $U_1(x) \in (0, \infty)$ and $U_{11}(x) \in (-\infty, 0)$; (3)

B. $\forall x \in D(U)$, $d(-U_{11}(x)/U_1(x))/dx \le 0$;

C. $\forall x \in D(U)$, $d(-xU_{11}(x)/U_1(x))/dx \ge 0$;

D. $N$ is a continuous bounded random variable with finite greatest
lower bound and finite least upper bound, $N_{\min}$ and $N_{\max}$ respectively,
with $N_{\min} > 0$;

E. $D(U) = (0, \infty)$ and $\lim_{x \downarrow 0} U_1(x) = \infty$;

F. $Y > \bar N + N_{\max}$;

G. $M_D > N_{\max}$.

3A-C are the conditions of positive but decreasing marginal utility, nonincreasing
absolute risk aversion, and nondecreasing relative risk aversion
[1, 12]. Assumptions 3E-G are not critical. The proofs of the theorems
vary slightly depending upon the domain of the utility function, the
behavior of the marginal utility function near zero, and the size of $M_D$
relative to $N_{\max}$. The variations in the proofs are not sufficiently great
to justify presentation of all cases, so assumptions 3E-G are made for
expositional brevity only.$^4$


1. The Selection of $M_D$ and $M_R$

3E implies $M_D \in (0, Y)$ and $M_D$ satisfies

$$D'(M_D) = \left.\frac{\partial D}{\partial M}\right|_{M_D} = U_1(M_D - \bar N) - \tau r U_1(r(Y - M_D) - \bar N) = 0. \tag{4}$$

4 An appendix is available from the author which indicates the precise modifications
required of the proofs of the theorems if assumptions 3E-G are not made. The only
modification required of the statements of the theorems presented in the text is that the
strong inequalities in Theorems 1-3 be changed to weak inequalities to take into account
the trivial cases that if $M_D$ takes on a corner maximum of 0 ($Y$) then since $0 < M_R < Y$
the precautionary money balance cannot be strictly negative (strictly positive).





$U_{11} < 0 \Rightarrow \partial^2 D/\partial M^2 < 0$, and hence the second order condition for a
maximum is satisfied.
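As an illustration of condition (4), the following sketch solves for $M_D$ numerically. It assumes logarithmic utility, $U(x) = \ln x$, which satisfies 3A-C and 3E but is only one admissible choice; the parameter values and function names are hypothetical. For this utility, (4) also has the closed form $M_D = (rY - \bar N + \tau r \bar N)/(r(1 + \tau))$, which serves as a check.

```python
def u1(x):
    """Marginal utility for the illustrative choice U(x) = ln(x)."""
    return 1.0 / x

def d_prime(M, Y, n_bar, r, tau):
    """Left-hand side of the first order condition (4)."""
    return u1(M - n_bar) - tau * r * u1(r * (Y - M) - n_bar)

def solve_m_d(Y, n_bar, r, tau, tol=1e-12):
    """Bisection on D'(M), which is strictly decreasing in M because U_11 < 0."""
    lo = n_bar + 1e-9             # keep the period 2 surplus M - n_bar positive
    hi = Y - n_bar / r - 1e-9     # keep the period 3 surplus r(Y - M) - n_bar positive
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if d_prime(mid, Y, n_bar, r, tau) > 0:
            lo = mid              # marginal gain from more money is still positive
        else:
            hi = mid
    return 0.5 * (lo + hi)

def closed_form_m_d(Y, n_bar, r, tau):
    """Solution of (4) that holds only under the log-utility assumption."""
    return (r * Y - n_bar + tau * r * n_bar) / (r * (1.0 + tau))

# Hypothetical parameter values chosen so that 3F and 3G can hold.
Y, n_bar, r, tau = 10.0, 2.0, 1.05, 0.95
print(solve_m_d(Y, n_bar, r, tau), closed_form_m_d(Y, n_bar, r, tau))
```

Bisection is used only because $D'$ is monotone; any root finder would do.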

In the second period in the stochastic case there are three types of 
expenditure adjustments dependent upon the realized value of N and the
amount of money held, M. 

A. $N \in I_1(M)$ iff $U_1(M - N) \le \tau U_1(r(Y - M) - \bar N)$. $N \in I_1(M) \Rightarrow
C \le M$, $A = 0$. 3E implies $C > 0$ and maximization requires $C$ satisfy
$U_1(C - N) = \tau U_1(r(Y - M) + M - C - \bar N)$.

B. $N \in I_2(M)$ iff $U_1(M - N) > \tau U_1(r(Y - M) - \bar N)$ and

$$U_1(M - N) \le \tau r \gamma U_1(r(Y - M) - \bar N). \tag{5}$$

$N \in I_2(M) \Rightarrow C = M$, $A = 0$. The first order conditions imply that the
boundary solutions $C = M$ and $A = 0$ are optimal.

C. $N \in I_3(M)$ iff $U_1(M - N) > \tau r \gamma U_1(r(Y - M) - \bar N)$. $N \in I_3(M) \Rightarrow
C = M$, $A > 0$. Maximization requires $A$ satisfy $U_1(M + A - N) =
\tau r \gamma U_1(r(Y - M) - r\gamma A - \bar N)$. Again $U_{11} < 0$ implies the second order
condition for a maximum is satisfied.
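The three cases can be sketched in code. The example again assumes $U(x) = \ln x$, for which the first order conditions in cases A and C solve in closed form; the function name, regime labels, and parameter values are hypothetical.

```python
def period2_plan(M, N, Y, n_bar, r, tau, gamma):
    """Return (regime, C, A) for U(x) = ln(x), following cases 5A-C.

    Assumes M > N and r * (Y - M) > n_bar so both marginal utilities are defined.
    """
    u1_now = 1.0 / (M - N)                     # U_1(M - N)
    u1_later = 1.0 / (r * (Y - M) - n_bar)     # U_1(r(Y - M) - N_bar)
    if u1_now <= tau * u1_later:
        # Case A: some money is carried into period 3.
        # Solve U_1(C - N) = tau * U_1(W - C) with W = r(Y - M) + M - N_bar.
        W = r * (Y - M) + M - n_bar
        C = (W + tau * N) / (1.0 + tau)
        return "I1", C, 0.0
    if u1_now <= tau * r * gamma * u1_later:
        # Case B: all money is spent and no bonds are converted.
        return "I2", M, 0.0
    # Case C: all money is spent and bonds are converted early.
    # Solve U_1(M + A - N) = tau * r * gamma * U_1(r(Y - M) - r * gamma * A - N_bar).
    A = (r * (Y - M) - n_bar - tau * r * gamma * (M - N)) / (r * gamma * (1.0 + tau))
    return "I3", M, A

# Low, middle, and high realizations of N with M held fixed (hypothetical values).
for N in (1.0, 2.0, 4.0):
    print(N, period2_plan(M=5.0, N=N, Y=10.0, n_bar=2.0, r=1.05, tau=0.95, gamma=1.2))
```

With these values, a low realization of $N$ falls in $I_1$, a middle one in $I_2$, and a high one in $I_3$.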

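From (1) and 5A-C, it is straightforward to derive that $H'(M) = \partial H/\partial M$
may be written as either

$$H'(M) = \frac{\partial H}{\partial M} = \int_{I_1(M)} (1 - r)\,U_1(C(M,N) - N) f(N)\,dN + \int_{I_2(M)} \left[U_1(M - N) - \tau r U_1(r(Y - M) - \bar N)\right] f(N)\,dN + \int_{I_3(M)} (1 - (1/\gamma))\,U_1(M + A(M,N) - N) f(N)\,dN, \tag{6}$$

or

$$H'(M) = \frac{\partial H}{\partial M} = \int_{N_{\min}}^{N_{\max}} \left[U_1(C(M,N) + A(M,N) - N) - \tau r U_1(r(Y - M) + M - C(M,N) - r\gamma A(M,N) - \bar N)\right] f(N)\,dN. \tag{7}$$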

It is tedious but straightforward to derive that $\partial H/\partial M$ is a continuous and
differentiable function of $\tau$, $Y$, $\gamma$, $r$, and $M$, and specifically, the assumption
$U_{11} < 0$ implies $\partial^2 H/\partial M^2 < 0$. If $M = 0$ then assumption 3E implies
$\forall N \in [N_{\min}, N_{\max}]$, $N \in I_3(0)$. Since $\gamma > 1$ and $U_1 > 0$, (6) implies
$H'(0) > 0$. If $M = Y$, 3E implies $\forall N \in [N_{\min}, N_{\max}]$, $N \in I_1(Y)$. Since
$r > 1$ and $U_1 > 0$, (6) implies $H'(Y) < 0$. Hence the boundary points
$M = 0$ and $M = Y$ cannot be optimal for $r > 1$, $\gamma > 1$. Thus, the optimal
value of money holdings in the stochastic case, $M_R$, satisfies
$H'(M_R) = 0$ and $M_R \in (0, Y)$.


2. The Sign of the Precautionary Money Balance 

Theorem 1. Given $r$, $\exists \gamma$ sufficiently high that $M_R - M_D > 0$.

That is, given the interest rate, it is always possible to find a cost of 
premature bond conversion which is high enough to insure that the 
precautionary money balance is positive. 

Proof. 3A, E, and G imply $U_1(M_D - N)$ is finite for all $N \in [N_{\min},
N_{\max}]$. Hence from 5C, $\exists \gamma^*$ such that $I_3(M_D, \gamma^*)$ is empty and
$A(M_D, N, \gamma^*) = 0$ for all $N \in [N_{\min}, N_{\max}]$. From (7) it then follows

$$H'(M_D, \gamma^*) = \int_{N_{\min}}^{N_{\max}} \left[U_1(C(M_D,N) - N) - \tau r U_1(r(Y - M_D) + M_D - C(M_D,N) - \bar N)\right] f(N)\,dN. \tag{8}$$

Since $C(M_D, N) \le M_D$ and $U_{11} < 0$, it follows that

$$H'(M_D, \gamma^*) \ge \int_{N_{\min}}^{N_{\max}} U_1(M_D - N) f(N)\,dN - \tau r U_1(r(Y - M_D) - \bar N). \tag{9}$$

Since the assumption of nonincreasing absolute risk aversion implies
$U_{111} > 0$, $U_1$ is strictly convex in $N$. It then follows from (9), Jensen’s
inequality for a strictly convex function (Feller [4]), and the first order
condition for optimality in the deterministic case that

$$H'(M_D, \gamma^*) > U_1(M_D - \bar N) - \tau r U_1(r(Y - M_D) - \bar N) = 0. \tag{10}$$

$\partial^2 H/\partial M^2 < 0$ then implies $M_R > M_D$ for $\gamma \ge \gamma^*$.
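The step from (9) to (10) rests on Jensen's inequality applied to the convex function $U_1$. A minimal numerical check, assuming $U(x) = \ln x$ and a hypothetical two-point distribution for $N$ (the paper assumes a continuous one), is:

```python
def u1(x):
    """Marginal utility of U(x) = ln(x); it is convex in x because U_111 > 0."""
    return 1.0 / x

M = 5.0
outcomes = [(1.0, 0.5), (3.0, 0.5)]     # hypothetical (N, probability) pairs

n_bar = sum(N * p for N, p in outcomes)                   # expected necessities
expected_u1 = sum(p * u1(M - N) for N, p in outcomes)     # E[U_1(M - N)]
u1_at_mean = u1(M - n_bar)                                # U_1(M - E[N])

# Jensen's inequality for a strictly convex function: the first exceeds the second.
print(expected_u1, u1_at_mean)
```

The expected marginal utility of money under uncertainty exceeds the marginal utility at the mean, which is what pushes $H'(M_D)$ above zero once early conversion is ruled out.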

Theorem 2. Given $r$, $\exists \gamma$ sufficiently low, but still greater than 1, such
that $M_R - M_D < 0$.

That is, for any level of the interest rate, we can find a cost of premature 
bond conversion which is low enough, but still greater than 1, such that 
the precautionary demand for money is negative. 





To avoid confusion it should be noted that a negative precautionary 
money balance in no way implies some sort of “negative money” is being 
held. It merely implies that a lesser positive amount of money is held in 
the uncertain case than in the deterministic case. If $\gamma$ is low, the individual
is not particularly concerned with the possibility N may be high and he 
will want more money since the cost of converting bonds to money is 
very low. However, he is concerned about the possibility N may be low 
and he will need little money. In this event, a high level of money holdings 
implies a large amount of foregone interest. 

Proof. Since $H'(M)$ in (6) and (7) is a continuous function of $\gamma$, it
follows that $\lim_{\gamma \downarrow 1} H'(M_D, \gamma) = H'(M_D, 1)$. From (6), it is clear that if
$\gamma = 1$, the integral over $I_3(M_D)$ is zero. From 5B and (6), if $\gamma = 1$, the
integral over $I_2(M_D)$ is negative (or zero iff $I_2(M_D)$ has zero probability).
Since $r > 1$ and $U_1 > 0$, the integral over $I_1(M_D)$ is negative (or zero
iff $I_1(M_D)$ has zero probability). Since $I_1(M_D) \cup I_2(M_D)$ cannot have zero
probability for $\gamma = 1$,$^5$ $H'(M_D, 1) < 0$. Hence $\lim_{\gamma \downarrow 1} H'(M_D, \gamma) < 0$
and the continuity of $H'(M)$ in $\gamma$ then implies $\exists \gamma^*$ s.t. for all $\gamma$, $1 < \gamma < \gamma^*$,
$H'(M_D, \gamma) < 0$. Since $\partial^2 H/\partial M^2 < 0$, $M_R - M_D < 0$ for $1 < \gamma < \gamma^*$.$^6$
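Theorems 1 and 2 can be illustrated by evaluating $H'(M_D)$ from (7) at a high and a low conversion cost. Because $\partial^2 H/\partial M^2 < 0$, the sign of $H'(M_D)$ gives the sign of $M_R - M_D$. The sketch assumes $U(x) = \ln x$ and $N$ uniform on $[N_{\min}, N_{\max}]$; both are illustrative assumptions, as are the parameter values and function names.

```python
def u1(x):
    return 1.0 / x          # marginal utility of U(x) = ln(x)

def m_d(Y, n_bar, r, tau):
    """Closed-form solution of (4), valid only under log utility."""
    return (r * Y - n_bar + tau * r * n_bar) / (r * (1.0 + tau))

def integrand(M, N, Y, n_bar, r, tau, gamma):
    """Bracketed term of (7) at one value of N, with C and A chosen as in 5A-C."""
    later = r * (Y - M) - n_bar
    if u1(M - N) <= tau * u1(later):                    # regime I1: carry money over
        C = (later + M + tau * N) / (1.0 + tau)
        A = 0.0
    elif u1(M - N) <= tau * r * gamma * u1(later):      # regime I2: spend all money
        C, A = M, 0.0
    else:                                               # regime I3: convert bonds early
        C = M
        A = (later - tau * r * gamma * (M - N)) / (r * gamma * (1.0 + tau))
    now = C + A - N                                     # period 2 surplus over necessities
    then = later + M - C - r * gamma * A                # period 3 surplus over necessities
    return u1(now) - tau * r * u1(then)

def h_prime(M, Y, n_min, n_max, r, tau, gamma, steps=20000):
    """Midpoint-rule approximation of (7) for N uniform on [n_min, n_max]."""
    n_bar = 0.5 * (n_min + n_max)
    width = (n_max - n_min) / steps
    total = 0.0
    for i in range(steps):
        N = n_min + (i + 0.5) * width
        total += integrand(M, N, Y, n_bar, r, tau, gamma)
    return total * width / (n_max - n_min)              # density f(N) = 1 / (n_max - n_min)

# Hypothetical parameters satisfying 3F and 3G.
Y, n_min, n_max, r, tau = 10.0, 1.0, 3.0, 1.05, 0.95
M_D = m_d(Y, 0.5 * (n_min + n_max), r, tau)
for gamma in (1.01, 2.0):
    print(gamma, h_prime(M_D, Y, n_min, n_max, r, tau, gamma))
```

With these values $H'(M_D)$ is negative at $\gamma = 1.01$ and positive at $\gamma = 2$, in line with a negative and a positive precautionary balance respectively.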

Theorem 3. Given $\gamma$, $\exists r$ sufficiently low, but still greater than 1, such
that $M_R - M_D > 0$.


5 If $\gamma = 1$, it follows from 5B and the first order condition for optimality in the
deterministic case and $U_{11} < 0$, that for all $N \le \bar N$, $N \in I_1(M_D) \cup I_2(M_D)$. Hence
$I_1(M_D) \cup I_2(M_D)$ has nonzero probability at $\gamma = 1$ as long as the probability density
function of $N$ is nondegenerate, which it is assumed to be.

6 This proof has continually made use of the limits of the functions $M_D(\gamma)$ and
$M_R(\gamma)$ as $\gamma$ approaches 1. It is important to distinguish between the limit of these
functions as $\gamma$ approaches the boundary point 1 and the actual money balances the
individual might hold if in fact $\gamma$ were equal to one. The continuity of the first order
conditions with respect to $\gamma$ assures the existence of the limit and that the limit is an
optimal solution at the boundary. However, even though the limit exists and is unique,
it is not necessarily the unique optimal solution at the point $\gamma = 1$. If $\gamma = 1$ and hence
premature bond conversion is costless, there is no reason not to hold all bonds in
either the deterministic or random case, i.e., $M_D = M_R = 0$. In fact at $\gamma = 1$, the
individual is indifferent among $M_D$ in the range $[0, L_D]$ where $L_D = \lim_{\gamma \downarrow 1} M_D(\gamma)$.
Similarly, the individual is indifferent among $M_R$ in the range $[0, L_R]$ where $L_R =
\lim_{\gamma \downarrow 1} M_R(\gamma)$. However, since the theorem in the text is concerned with behavior near
but not necessarily at the boundary point $\gamma = 1$, it is the limit as $\gamma$ approaches 1 which
is important and not actual behavior at the boundary point.

This same type of relationship between limits and behavior at the boundary arises
in the next theorem in which the limits as $r$ approaches 1 of various functions are
analyzed. If $r = 1$, there is of course no reason not to have solutions $M_R = M_D = Y$.
However, $\lim_{r \downarrow 1} M_D(r)$ will satisfy $U_1(M_D - \bar N) = \tau U_1(Y - M_D - \bar N)$. Similarly,
$\lim_{r \downarrow 1} M_R$ need not equal $Y$, but it will satisfy $H'(M_R, r = 1) = 0$.





Recalling that r is one plus the interest rate, Theorem 3 says that given the 
cost of premature bond conversion, there is a value of the interest rate 
sufficiently low, but still greater than 0, such that the precautionary demand 
for money is positive. 

Proof. Since $M_D$ is a continuous function of $r$ and $H'(M)$ is a continuous
function of $M$ and $r$, it follows $\lim_{r \downarrow 1} H'(M_D(r), r) = H'(M_D(1), 1)$
where $M_D(1)$ is the value of $M_D$ which satisfies $U_1(M_D - \bar N) =
\tau U_1(Y - M_D - \bar N)$. From (6), clearly the integral over $I_1(M_D(1))$ in
$H'(M_D(1), 1)$ is zero. From 5B and (6) the integral over $I_2(M_D(1))$ in
$H'(M_D(1), 1)$ is positive (or zero iff $I_2(M_D(1))$ has zero probability).
The assumptions $\gamma > 1$, $U_1 > 0$ imply the integral over $I_3(M_D(1))$ in
$H'(M_D(1), 1)$ is positive (or zero iff $I_3(M_D(1))$ is empty). Since
$I_2(M_D(1)) \cup I_3(M_D(1))$ has nonzero probability,$^7$ $\lim_{r \downarrow 1} H'(M_D(r), r) > 0$.
It then follows from the continuity of $H'(M)$ in $M$ and $r$ and the continuity
of $M_D$ in $r$, $\exists r^* > 1$ such that $\forall r$, $1 < r < r^*$, $H'(M_D(r), r) > 0$.
$\partial^2 H/\partial M^2 < 0$ then implies $M_R - M_D > 0$ for $1 < r < r^*$.

Theorems 1, 2, and 3 looked at together suggest there ought to be a
Theorem 4 which states: Given $\gamma$, $\exists r$ sufficiently high such that
$M_R - M_D < 0$. However, this is not true.$^8$ The basic reason it is not true
is that the marginal interest gain associated with holding a negative
precautionary money balance cannot be made arbitrarily large by raising $r$.
This gain is $(r - 1) \int_{I_1(M_D)} U_1(C(M_D, N) - N) f(N)\,dN$. Either or both
of declining marginal utility (declining $U_1(C(M_D, N) - N)$) or declining
range of $I_1(M_D)$ may offset the interest rate rise.


3. The Effect of Increases in Initial Wealth on the 
Precautionary Money Balance 

It is not difficult to obtain an expression for $\partial(M_R - M_D)/\partial Y$, but examination
of the result does not permit a general determination of its sign or
even simple conditions under which the sign will be positive or negative.
However, it can be shown that $\partial(M_R - M_D)/\partial Y$ is not always positive or
negative, but may take on both values for given utility function and $f(N)$.
Although we cannot show anything definite about the effect of infinitesimal

7 From the first order condition for $M_D(1)$, 5B, and the assumption $U_{11} < 0$, $N \in
(\bar N, N_{\max}] \Rightarrow N \in I_2(M_D(1)) \cup I_3(M_D(1))$. Since the distribution of $N$ is assumed to be
nondegenerate, $I_2(M_D(1)) \cup I_3(M_D(1))$ cannot have zero probability.

8 Counterexamples are easily constructed by picking $\gamma$ to force $I_3(M_D)$ to be empty
over all $r$. It is easy to prove by a method similar to that used in Theorem 1 that
$I_3(M_D)$ empty implies $M_R > M_D$.





increases in Y, there are some interesting effects of large increases in Y 
which may be determined. 

Theorem 4. If $\lim_{x \to \infty} U_1(x) = 0$ and $\lim_{x \to \infty}(-U_{11}(x)/U_1(x)) = 0$, then
$\lim_{Y \to \infty}(M_R - M_D) = 0$.

Recall that $-U_{11}(x)/U_1(x)$ is the measure of absolute risk aversion which
is assumed not to increase as x increases. Theorem 4 says that if the 
marginal utility of expenditure goes to zero and the absolute risk aversion 
goes to zero as expenditure increases, then the precautionary money 
balance also goes to zero as income increases. 

Proof. See the Appendix. 

Theorem 4 is not surprising when considered in terms of Arrow’s 
interpretation of decreasing absolute risk aversion [1]. Arrow demonstrates
that as income increases an individual whose utility function is characterized
by decreasing absolute risk aversion becomes more willing to
accept the possibility of adverse outcomes in a fixed size simple bet. In
Arrow’s example this is shown as a willingness to accept a higher probability
of an adverse outcome. In the model here, as expenditure in period 2
increases with wealth ($M_D$ increases with $Y$), an individual whose utility
function is characterized by decreasing absolute risk aversion becomes 
more willing to accept the gamble on the level of necessities which exists 
in the uncertain case. In particular, he is more willing to accept adverse 
outcomes and this is shown as a decreased demand for a buffer to offset 
such events — i.e., a decreased demand for a precautionary money balance. 

Theorem 4 and its interpretation depend very critically upon the 
assumption that the size of the gamble remains constant as Y increases. 
If necessities are viewed in an absolute minimal survival sense then the 
assumption is reasonable. However, if it is the individual’s subjective 
view of what constitutes “necessities” which is relevant, then the range 
of $N$, $\bar N$, etc., may well increase with $Y$.

Let $N$ have the form $N = pY$, where $p$ is a random variable with probability
density function $f(p)$ defined on a bounded set. It is then straightforward
to show that Theorem 4 is no longer valid. In particular, if the
marginal utility function is homogeneous in E — N, then the precautionary 
money balance is a constant proportion of Y even though the absolute 
risk aversion goes to zero as Y increases. 
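The proportionality claim can be seen from a scaling argument. The following is a sketch under stated assumptions, not the author's derivation: $U_1$ is taken to be homogeneous of some degree $k$, and the lowercase variables $m = M/Y$, $c = C/Y$, $a = A/Y$ and the mean $\bar p$ of $p$ are notation introduced here.

```latex
% Assumption: U_1(\lambda x) = \lambda^{k} U_1(x) for all \lambda > 0, with N = pY and \bar N = \bar p Y.
% Deterministic case: condition (4) evaluated at M = mY becomes
\begin{align*}
U_1(mY - \bar p Y) - \tau r\, U_1\bigl(r(Y - mY) - \bar p Y\bigr)
  &= Y^{k}\Bigl[U_1(m - \bar p) - \tau r\, U_1\bigl(r(1 - m) - \bar p\bigr)\Bigr] = 0,
\end{align*}
% so the root m_D does not depend on Y.
% Stochastic case: the conditions in 5A-C scale in the same way, so C = cY and A = aY
% with c and a depending on (m, p) only, and (7) becomes
\begin{align*}
H'(mY) &= Y^{k} \int \Bigl[U_1(c + a - p)
  - \tau r\, U_1\bigl(r(1 - m) + m - c - r\gamma a - \bar p\bigr)\Bigr] f(p)\,dp = 0,
\end{align*}
% so the root m_R does not depend on Y either. Hence
\begin{align*}
M_R - M_D &= (m_R - m_D)\,Y .
\end{align*}
```

Under these assumptions the precautionary balance grows in proportion to $Y$ whatever the behavior of absolute risk aversion.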

Returning now to the case in which f(N) remains fixed as Y increases, 
Theorem 4 implies that the precautionary money balance is an unimportant
element of the overall money demand for large values of $Y$. This
implication of Theorem 4 is in fact valid even if the condition of Theorem 4
$\lim_{x \to \infty}(-U_{11}(x)/U_1(x)) = 0$ is not satisfied. That is, if $\lim_{x \to \infty} U_1(x) = 0$,
then the importance of the precautionary money balance relative to total
money holdings is very small for large Y. In particular, we can prove the 
following theorem. 

Theorem 5. If $\lim_{x \to \infty} U_1(x) = 0$, then $\lim_{Y \to \infty}((M_R - M_D)/M_D) = 0$.

Proof. See the Appendix. 

Theorem 5 is intuitively more plausible than Theorem 4. Since the 
relative size of the gamble in the stochastic case declines as income 
increases, the relative adjustment made in money holdings under the 
conditions of uncertainty also declines. Again, as was the case with 
Theorem 4, Theorem 5 is not valid if necessities themselves are functions
of $Y$.


4. The Effect of Increases in Risk on the 
Precautionary Demand for Money 

In this section an “increase in risk” is assumed to be any one of the
three equivalent definitions presented by Rothschild and Stiglitz [8, 9].
One of these definitions is: If $X$ and $Y$ are two random variables taking
on nonzero probability on a bounded set $[a, b]$ and $E(X) = E(Y)$, then $Y$
is riskier than $X$ iff $E[U(X)] \ge E[U(Y)]$ for all concave $U$.
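A small numerical example may help fix the definition. It uses two hypothetical discrete lotteries with the same mean, the second obtained from the first by pushing outcomes outward, and checks the inequality for two particular concave functions. This illustrates the definition; it does not prove it for all concave $U$.

```python
import math

def expectation(fn, lottery):
    """Expected value of fn over a list of (outcome, probability) pairs."""
    return sum(p * fn(x) for x, p in lottery)

lottery_x = [(1.0, 0.5), (3.0, 0.5)]
lottery_y = [(0.5, 0.5), (3.5, 0.5)]    # same mean as lottery_x, outcomes spread outward

identity = lambda x: x
print("means:", expectation(identity, lottery_x), expectation(identity, lottery_y))

# For concave U the riskier lottery has the lower expected utility.
for name, U in (("log", math.log), ("sqrt", math.sqrt)):
    print(name, expectation(U, lottery_x), expectation(U, lottery_y))

# For a convex function the ordering reverses, which is the property used for L(N).
convex = lambda n: 1.0 / (5.0 - n)
print("convex:", expectation(convex, lottery_x), expectation(convex, lottery_y))
```

The last line shows the mirror-image fact: the same spread raises the expectation of a convex function.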

It can be derived from this definition and the fact that $\partial^2 H/\partial M^2 < 0$
that the effect of an increase in risk on the precautionary demand for
money depends upon the convexity or concavity of

$$L(N) = U_1(C(M_R,N) + A(M_R,N) - N) - \tau r U_1(r(Y - M_R) + M_R - C(M_R,N) - r\gamma A(M_R,N) - \bar N),$$

where $\int L(N) f(N)\,dN = 0$ from the first order condition for maximization
in the stochastic case. In particular, if L is strictly convex (concave) in N 
then an increase in risk unambiguously increases (lowers) the precautionary 
demand for money. If L contains some concave and some convex segments, 
then it is always possible to find some “increases in risk” which lower the 
precautionary demand for money and some “increases in risk” which 
raise the precautionary demand for money. 

It is straightforward, but tedious, to derive that if $N \in I_1(M_R)$, then
$L(N)$ is strictly concave in $N$ and if $N \in I_2(M_R) \cup I_3(M_R)$ then $L(N)$ is
strictly convex in $N$. It follows that

Theorem 6. It is never true that all increases in risk lower the pre¬ 
cautionary demand for money. 





Proof. The set $\{N \mid N \in I_2(M_R) \cup I_3(M_R)\}$ cannot have zero probability
since this condition together with (6) would imply the contradiction
$H'(M_R) < 0$. Since $L(N)$ is strictly convex for $N \in I_2(M_R)$ and $N \in I_3(M_R)$,
there always exists an increase in risk which raises the precautionary
demand for money.

Theorem 6 is the strongest statement that can in general be made about 
the effect of an increase in risk on the precautionary demand for money. 
It cannot be guaranteed in general that $I_1(M_R)$ is empty. Since $L(N)$ is
strictly concave in $N$ for $N \in I_1(M_R)$, if $I_1(M_R)$ is not empty, there always
exists an increase in risk which lowers the precautionary demand for
money. If $I_1(M_R)$ is not empty, then there is the possibility of low enough
$N$ such that the individual will actually decide to carry over money from
period 2 to period 3. Thus for these values of $N$ the individual is losing
interest which he could have earned if he had held a lesser amount of
money than $M_R$. Some increases in risk which in particular increase the
probability of these extreme low values of N may then tend to cause the 
individual to hold less money. On the other hand, increases in risk which 
do not affect the probability in this extreme tail of the N distribution will 
not tend to decrease the amount of money held. 

Although it is not always true that all increases in risk raise the precautionary
demand for money, this stronger statement will be true for some
particular cases. Specifically,

Theorem 7. If $\lim_{x \to \infty} U_1(x) = 0$ and $\lim_{x \to \infty}(-U_{11}(x)/U_1(x)) = 0$,
then $\exists Y^*$ such that for all $Y > Y^*$, all increases in risk raise the precautionary
demand for money.

Proof. From Lemma 3, which is proven in the Appendix, it follows
then $\exists Y^*$ such that for $Y > Y^*$, $I_1(M_R) \cup I_3(M_R)$ is empty. Hence for
$Y > Y^*$ all $N$ are elements of $I_2(M_R)$. Since $L(N)$ is strictly convex for
$N \in I_2(M_R)$, Theorem 7 follows.


5. Concluding Remarks 

In investigating the household demand for money under uncertainty, 
Tsiang [11] has modified the stochastic cash inflow-outflow models of 
demand for money developed for firms [e.g., 3, 6, 1] to account for the 
fact that the cash inflow of households is rarely an uncontrollable random 
variable. Essentially the model of this paper goes one step further and 
recognizes that cash outflows of a household are not entirely random
either. A cash outflow represents a decision to purchase a good or service 





and the outflow may be avoided by foregoing the purchase. In this model 
the individual determines total expenditure in a utility maximization 
framework and the random variable affects the optimization decision by 
affecting the utility derived from a given level of expenditure. 

In terms of formal structure, the model is similar to one used by 
Goldman [5]. The decision framework and interperiod trading conditions 
are essentially the same. Both are concerned with forms of uncertainty 
arising from random events which affect intertemporal preferences as 
opposed to uncertainty in yields or trading conditions. Although similar 
in formal structure, the analyses and utilization of the models are very 
different in the two studies since they concentrate on very different aspects 
of behavior. Goldman stays entirely within the framework of uncertainty
and considers the difference in money holdings under the assumption the 
individual behaves according to the principle of optimality and under the 
assumption he is myopic and ignores readjustments he will make as the 
uncertainty resolves itself into a particular certain event in the future. 
The analysis of this paper assumes that the individual acts according to 
the principle of optimality and considers how he behaves under uncertainty
as opposed to certainty.

The work in this paper suggests several extensions which should be 
investigated. Fixed transactions costs of bond-money conversion are not 
included in the model. The validity of the theorems under the discontinuities
introduced by such costs should be considered. Another shortcoming
of the model is the assumption that only two assets are available.
Clearly a range of assets with different features would be more plausible. 
Finally, as indicated in footnote 2, if the level of necessities expenditure 
in period 3 is a random variable distributed independently of necessities 
in period 2, the theorems presented in the paper are still valid. A much 
more interesting case, yet to be studied, is that in which necessities in 
period 3 are stochastic, but not necessarily distributed independently of 
those in period 2. 


Appendix 

Proof of Theorems 4 and 5. Several lemmas are needed for the proof 
of Theorem 4. 

Lemma 1. If $\lim_{x\to\infty} U_1(x) = 0$ and $\lim_{x\to\infty}(-U_{11}(x)/U_1(x)) = 0$, then 
for all $Y$ sufficiently large, $I_1(M_D) \cup I_3(M_D)$ is empty. 

Proof. From 3A and (4), $I_1(M_D)$ is empty iff $rU_1(M_D - N_{\min}) > U_1(M_D - \bar{N})$. 
Since $U_{111} > 0$, $U_1(M_D - N_{\min}) > U_1(M_D - \bar{N}) - (N_{\min} - \bar{N})U_{11}(M_D - \bar{N})$ 
and hence a sufficient condition for $I_1(M_D)$ empty is 

$$1 - (N_{\min} - \bar{N})\,(U_{11}(M_D - \bar{N})/U_1(M_D - \bar{N})) > 1/r. \quad (11)$$

The assumption $\lim_{x\to\infty} U_1(x) = 0$ combined with (4) implies 
$\lim_{Y\to\infty} M_D = \infty$. Consequently, since $r > 1$ and 
$\lim_{x\to\infty}[-U_{11}(x)/U_1(x)] = 0$, (11) must be valid for sufficiently large $Y$. 

Using a proof along similar lines starting from the point IfM 0 ) empty 
iff - AW) < yVfM D - N), it can be shown I 3 (M D ) is empty 

for sufficiently large Y. 

Lemma 2. If $\lim_{x\to\infty} U_1(x) = 0$ and $\lim_{x\to\infty}(-U_{11}(x)/U_1(x)) = 0$, then 
for all $Y$ sufficiently large, $M_R - M_D > 0$. 

Proof. From Lemma 1, (6), Jensen’s inequality for a strictly convex 
function, and (4) it follows that for sufficiently large Y , 

H'(M d ) * J L/,(M 0 - N)f(N)dN - rrUMY ~ M D ) - N) 

> f/,(W fl - fl) - rr(/j(r(y - jt/ B ) - fl) = 0. 


So $M_R - M_D > 0$. 

Lemma 3. If $\lim_{x\to\infty} U_1(x) = 0$ and $\lim_{x\to\infty}(-U_{11}(x)/U_1(x)) = 0$, then 
for all $Y$ sufficiently large, $I_1(M_R) \cup I_3(M_R)$ is empty. 

Proof. Assume Y is sufficiently large that IfM D ) u J :i ( M D ) is empty. 
Consider raising M from M D to M R * where UfM R * — AW) = 
t Ufr{Y — M R *) — N). Since IfM D ) is empty and M R * > M D , it 
follows that U X (M R * — AWO < rryVfr{ Y — M R *) — N) and hence 
IfM R *) is empty. Clearly from the choice of M R *, IfM R *) is also empty. 
Thus, from (6) and the fact U X (M R * — N) < UfM„* — AW) with 
equality holding only in the case N ■= AW, H’(M R *) < UfM R * — 
A'max) — rrUfr( Y — M R *) — N). From this inequality, the choice of 
M r *, and the fact U ni > 0, it is straightforward to derive that a sufficient 
condition for H’(M R *) < 0 is \^r-r(N min -N m&x )(U 11 (M„*~ 
N m *x)IUfM R * — A^max)). Since linv^f - U XX (M R * — N m&x )jU x (\f R * — 
A^max)) = 0, and r > 1, 3 K* such that for Y > Y*, the sufficient condition 
is satisfied and H’(M R *) < 0. The latter implies for Y > Y*. M R < M R *, 
but by the choice of M R *, M s < M„* => IfM R ) is empty. From 
Lemmas 1 and 2, for sufficiently large Y, IfM D ) is empty and M R > M D . 
But / s (Af/)) empty and M n > M D IfMf) empty. Hence for sufficiently 
large Y, IfM R ) u IfM R ) is empty. 





Lemma 4. $\exists M^*$ and $K > 0$ such that for all $M > M^*$, 
$U_{11}(M - N_{\max})/U_{11}(M) < K$. 

Proof. From 3C and B it is straightforward to derive respectively 

$$\frac{U_{11}(M - N_{\max})}{U_{11}(M)} \le \frac{M}{M - N_{\max}} \cdot \frac{U_1(M - N_{\max})}{U_1(M)}$$

and 

$$\frac{\partial\,(U_1(M - N_{\max})/U_1(M))}{\partial M} \le 0.$$

Furthermore 

$$\frac{\partial\,(M/(M - N_{\max}))}{\partial M} < 0,$$

so it follows that the lemma is clearly true for $M^* = 2N_{\max}$, 
$K = 2[U_1(N_{\max})/U_1(2N_{\max})]$. 

Proof of Theorem 4. From Lemmas 2 and 3, for sufficiently large Y, 
M r > M D and li(Mfi) u I 3 (M R ) is empty so from (6), H'{M R ) — 0 => 

| U,(Mr - JV)/(V)r/iV - rr£/,(r(y - Mr) - N) = 0. (12) 


From the Mean Value Theorem, 3A/', M D < M' < M R such that 

D'(M r ) = Z>'(M 0 ) + (Mr - M d ) (13) 

where Z)'(M) = l/,(M — N) - rrUMY — M) — N) and £>'(M) = 
U n (M — N) + r*rU n (r(Y - M) - #). Since £>'(M D ) = 0, 

Ui(M r -N)- rr(/,(r( T - Mr) - N) 

- (A/ r - M c )[t/ U (M' - fl) + r z rl/ u (r( F — M') — N)]. (14) 

Expanding U 3 (M R — N ) in (11) in a Taylor series about iV retaining the 
Lagrange form of the remainder term and then combining (12) and (14), 
we get 

| _ (N - NY Uni(MR _ N * (N))f(N) dN 

= (M« - M D )[U U (M' - N) + r z Tl/ u (r(y - M') - N)], (15) 

where N*(N) is between N and N. Utilizing the observations M D < 
M‘ < M r , t/ m > 0, and N is bounded, it can be derived from (15) 
that 3T, 0 < T < oo, such that 


M r - M d 


r-u^MR-jnm 
J U u (M r - N*(N)) 


Uu(M r - N m ax) 

U n {M n ) 


f(N) (IN. 

(16) 




It is straightforward to derive that the assumption of nondecreasing 
relative risk aversion implies 

$$\frac{-U_{111}(M_R - N^*(N))}{U_{11}(M_R - N^*(N))} \le \frac{-U_{11}(M_R - N^*(N))}{U_1(M_R - N^*(N))} + \frac{1}{M_R - N^*(N)}. \quad (17)$$

Since $N^*(N)$ is bounded, $\lim_{Y\to\infty} M_R = \infty$,⁹ $\lim_{x\to\infty}(-U_{11}(x)/U_1(x)) = 0$, 
and $-U_{111}(M_R - N^*(N))/U_{11}(M_R - N^*(N)) > 0$, (17) then implies 
$\lim_{Y\to\infty}(-U_{111}(M_R - N^*(N))/U_{11}(M_R - N^*(N))) = 0$. This latter limit 
together with (16) and Lemma 4 imply $\lim_{Y\to\infty}(M_R - M_D) = 0$. 

Proof of Theorem 5. Since $\lim_{x\to\infty} U_1(x) = 0$ and (4) together imply 
$\lim_{Y\to\infty} M_D = \infty$, clearly the theorem is true for utility functions which 
satisfy the conditions of Theorem 4. 

Thus, let us assume $\lim_{x\to\infty}(-U_{11}(x)/U_1(x)) \neq 0$. It then follows from 
3C that $\exists \alpha > 0$ such that $-U_{11}(x)/U_1(x) > \alpha$ for all $x$. 

Now assume M R > M n . Clearly M R < M R * where M R * satisfies 
U t (M R * — iVmax) - rU x (r( Y — M R *) — N). The latter implies all /V 
are elements of I X (M R *) and from (6), H'(M R *) < 0. Thus, M R < M R *. 
Since $U_{111} > 0$, $U_1(M_R^*) < U_1(M_D) + (M_R^* - M_D)U_{11}(M_R^*)$ and it 
then follows from the choice of $\alpha$ that: 

$$U_1(M_D)/U_1(M_R^*) > 1 + \alpha(M_R^* - M_D). \quad (18)$$

Thus, if $U_1(M_D)/U_1(M_R^*)$ is bounded from above as $Y \to \infty$ so is 
$M_R^* - M_D$. Since $M_R^* > M_D$, it follows from the choice of $M_R^*$ 
and (4) that 

. ^ Ui(Mp - N) 

" UfM R *-N mn ) 

UfMo) UfM D )jU x {M R *) 

' UfM R * - N m „) ~ UfM R * — Nmax) I UfM R *) ■ 

Lemma 4 10 then implies 3K > 0 such that for all Y sufficiently large 

_ U X ( M D )fU x {M R *) 

K 


9 The first order condition for optimality in the deterministic case and the assumption 
$\lim_{x\to\infty} U_1(x) = 0$ imply $\lim_{Y\to\infty} M_D = \infty$. Lemma 2 then assures $\lim_{Y\to\infty} M_R = \infty$. 

10 Note that in the proof of Lemma 4, the assumption of Theorem 4, 
$\lim_{x\to\infty}(-U_{11}(x)/U_1(x)) = 0$, is not used. In the proof of the lemma it is only assumed 
that $-U_{11}(x)/U_1(x)$ is nonincreasing as $x$ increases and hence Lemma 4 is applicable here. 





Thus $U_1(M_D)/U_1(M_R^*)$ is bounded from above as $Y$ increases. Hence 
from (18), $M_R^* - M_D$ is bounded from above as $Y$ increases. Since 
$M_R^* > M_R$, $M_R - M_D$ is bounded from above as $Y$ increases. $M_R - M_D$ 
bounded and $\lim_{Y\to\infty} M_D = \infty$ together imply $\lim_{Y\to\infty}((M_R - M_D)/M_D) = 0$. 

If Mg < M D , then Mg > Mg* where U x (Mg* — N mln ) = 
rryU x (r(Y — Mg*) — N). For Mg*, all N are elements of I S (M S *) so 
$H'(M_R^*) > 0$ and $M_R > M_R^*$. By a proof analogous to the one just 
given it can be shown $M_R^* - M_D$ is bounded as $Y$ becomes infinite so 
$M_R - M_D$ is bounded as $Y$ becomes infinite and 
$\lim_{Y\to\infty}((M_R - M_D)/M_D) = 0$. 


References 

1. K. J. Arrow, “Essays in the Theory of Risk-Bearing,” Markham, Chicago, 1971. 

2. R. Bellman, “Dynamic Programming,” Princeton Univ. Press, Princeton, N. J., 
1957. 

3. G. D. Eppen and E. F. Fama, Cash balance and simple dynamic portfolio problems 
with proportional costs, Internat. Econ. Rev. 10 (1969), 119-133. 

4. W. Feller, “An Introduction to Probability Theory and Its Applications,” Vol. II, 
Wiley, New York, 1966. 

5. S. M. Goldman, Flexibility and the demand for money, J. Econ. Theory 9 (1974), 
203-222. 

6. M. H. Miller and D. Orr, A model of the demand for money by firms, Quarterly 
J. Econ. 80 (1966), 413-435. 

7. D. Orr, “Cash Management and the Demand for Money,” Praeger, New York, 
1971. 

8. M. Rothschild and J. E. Stiglitz, Increasing risk I: A definition, J. Econ. 
Theory 2 (1970), 225-243. 

9. M. Rothschild and J. E. Stiglitz, Increasing risk II: Its economic consequences, 
J. Econ. Theory 3 (1971), 66-84. 

10. R. H. Strotz, Myopia and inconsistency in dynamic utility maximization, Rev. 
Econ. Studies 23 (1955-56), 165-180. 

11. S. C. Tsiang, The precautionary demand for money: An inventory theoretical 
analysis, J. Political Econ. 77 (1969), 99-117. 

12. S. C. Tsiang, The rationale of the mean-standard deviation analysis, skewness 
preference, and the demand for money, Amer. Econ. Rev. 62 (1972), 354-371. 



JOURNAL OF ECONOMIC THEORY 13, 82-111 (1976) 


Unequal Inequalities. II 

Serge-Christophe Kolm 


School for Higher Studies in the Social Sciences, and CEPREMAP, Paris, France 
Received March 7, 1975; revised December 2, 1975 


Summaries 

This paper analyzes properties of measures of inequality, applied to 
income inequalities but meaningful for practically any measure of dispersion 
in economics. We call $n$ the number of persons, $i$ the person’s 
index, $x_i$ person $i$’s income, $\bar{x} = \sum (x_i/n)$ the average income, $x$ the vector 
of the $x_i$’s or income distribution, $I(x)$ a real-valued function of $x$ which 
is the measure (or index) of inequality. 

Part I (Sects. I-V), which appeared in the last issue of this journal, 
analyzed several structures or properties, and specific forms, of $I$. We 
distinguished several $I$’s: the measures of inequality per person (or 
“absolute”) $I^a$, per pound (or “relative”) $I^r = I^a/\bar{x}$, and total $nI^a$. We 
presented several possible properties of inequality measures, such as: 
$I = 0$ if all $x_i$’s are equal (“zero at equality”), $I > 0$ otherwise (“positivity 
out of equality”), symmetry of $I$ for $x$ (“impartiality”), 
$((\partial I/\partial x_i) - (\partial I/\partial x_j))(x_i - x_j) > 0$ for $x_i \neq x_j$ (“rectifiance” of the function $I$, or 
“transfers principle,” this being the strict form whereas the weak one is 
with sign $\geq$), the fact that 

$$\frac{\partial(\bar{x} - I^a)/\partial x_i}{\partial(\bar{x} - I^a)/\partial x_j}$$

does not depend upon $x_k$ for $k \neq i, j$ (“welfare independence,” or, for 
short, “independence”). Rectifiance plus symmetry is Schur-convexity. 
Independence plus symmetry plus zero at equality implies that 
$\bar{\bar{x}} = \bar{x} - I^a = \varphi^{-1}[(1/n)\sum \varphi(x_i)]$, where $\bar{\bar{x}}$ is the “equal equivalent 
income”; and we will show that, these three properties being satisfied, the 
following ones are equivalent to each other: positivity out of equality, 
rectifiance, quasi-convexity, $\varphi$’s concavity. 

Part I largely focused on the study of six related specific measures of 
inequality, which in particular possess all the above properties: $\epsilon$, $\alpha$, and $\xi$ 
being positive parameters, they are 

$$I_c^a = \bar{x} + \xi - \Big[(1/n)\sum (x_i + \xi)^{1-\epsilon}\Big]^{1/(1-\epsilon)}$$

or 

$$I_c^a = \bar{x} + \xi - \prod (x_i + \xi)^{1/n},$$

$I_c^r = I_c^a/\bar{x}$, $I_r = I_c^r$ for $\xi = 0$, $I_r^a = \bar{x} I_r = I_c^a$ for $\xi = 0$, 

$$I_l = (1/\alpha)\,\mathrm{Log}\Big[(1/n)\sum e^{\alpha(\bar{x} - x_i)}\Big]$$

and $I_l^r = I_l/\bar{x}$. Lower indices $c$, $r$, $l$ respectively stand for “centrist,” 
“rightist,” and “leftist” measures of inequality. $I_r$ and $I_l$ are invariant 
under respectively equiproportional variation in, or equal addition to, 
all incomes; measures which have the first of these two properties are 
said to be “intensive.” 
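
The two invariance properties just stated can be checked numerically. The sketch below evaluates $I_r$ (case $\xi = 0$) and $I_l$ on a made-up distribution; the function names, the incomes, and the parameter values $\epsilon = 2$ and $\alpha = 0.01$ are assumptions chosen only for illustration. 

```python
import math

def mean(x):
    return sum(x) / len(x)

def rightist(x, eps):
    """I_r (xi = 0, eps != 1): 1 - [(1/n) sum (x_i / mean)^(1-eps)]^(1/(1-eps))."""
    m = mean(x)
    power_mean = mean([(xi / m) ** (1.0 - eps) for xi in x]) ** (1.0 / (1.0 - eps))
    return 1.0 - power_mean

def leftist(x, alpha):
    """I_l = (1/alpha) * Log[(1/n) sum exp(alpha * (mean - x_i))]."""
    m = mean(x)
    return math.log(mean([math.exp(alpha * (m - xi)) for xi in x])) / alpha

incomes = [100.0, 500.0, 900.0]           # hypothetical incomes
scaled = [3.0 * xi for xi in incomes]     # equiproportional variation
shifted = [xi + 50.0 for xi in incomes]   # equal addition to all incomes

# I_r is unchanged by the scaling, I_l by the equal addition.
print(math.isclose(rightist(scaled, 2.0), rightist(incomes, 2.0)))
print(math.isclose(leftist(shifted, 0.01), leftist(incomes, 0.01)))
```

Both measures are also positive for this unequal distribution and zero when all incomes are equal, as required by the properties recalled above. 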

We now consider different and more general measures, and other 
properties. We first reconcile the last two properties by dropping the 
“independence” one (Section VI). Then, we analyze another mildly 
equalitarian property, the “principle of diminishing transfers” (Section VII). 
Section VIII turns to the relations between inequality measures 
and Lorenz and concentration curves. We then consider the effect on 
inequality of additions of incomes, and we analyze the properties of 
“diminishing equality” (Section IX). The effect of unions of populations 
is the topic of Section X. Finally, the last section (XI) presents the more 
general relations between the various structural properties of inequality 
measures. 1 


VI. Synthetic Measures of Inequality 
VI. a. General Form 

We started by considering two properties: an inequality per pound which 
is invariant when all incomes are multiplied by the same number, and an 
inequality per person which is invariant when the same amount is added 
to all incomes. Is it not possible to find a measure which satisfies both, 
i.e., which encompasses at the same time the “rightist” and the “leftist” 
positions instead of being a “centrist” compromise which may betray both 
of them? True, the measures $I_r$ and $I_l$ (and $I_r^a$ and $I_l^r$) we found in answer 
to these two basic requirements differed from each other. But to obtain 
them we added further conditions and most notably the “independence” 
condition. If we accept dropping the latter for the sake of reconciling the 
first two requirements, we may find a solution to this problem. 

1 A number of this paper’s results were already presented—but without the proofs— 
at the 1966 Biarritz conference of the International Economic Association on Public 
Economics ([1, Sects. VI, VII]). 





In fact, we know that there exists at least one such measure: good old 
standard deviation of incomes $\sigma$ taken as a measure of inequality per 
person, and its companion the coefficient of variation $\sigma/\bar{x}$ for the corresponding 
measure of inequality per pound, since $\sigma = [(1/n)\sum (x_i - \bar{x})^2]^{1/2}$ 
is invariant when the same amount is added to all $x_i$’s, and $\sigma/\bar{x} = 
[(1/n)\sum ((x_i/\bar{x}) - 1)^2]^{1/2}$ is invariant when all $x_i$’s are multiplied by the 
same scalar $\lambda$. They also satisfy the requirements of being zero if all $x_i$’s 
are equal and positive otherwise, and of symmetry. And if we transfer 
one penny from person $i$ to person $j$, $\bar{x}$ does not change and $\sum (x_i - \bar{x})^2$ 
is increased by $2(x_j - x_i)$, which is a decrease if $x_j < x_i$, and $\sigma$ and $\sigma/\bar{x}$ 
vary in the same direction: both these measures thus satisfy the “transfer 
principle,” i.e., they are “rectifiant,” and also Schur-convex since they 
are symmetrical. And if we change variables from $x_i$ into $x_i/\bar{x}$, the average 
of which is 1, this property for $\sigma$ shows that it also holds for $\sigma/\bar{x}$ as function 
of the $x_i/\bar{x}$’s. Therefore, if a distribution has its Lorenz curve uniformly 
above that of another one, it also has a smaller $\sigma/\bar{x}$ (and thus a smaller $\sigma$ 
if the two distributions have the same average and total incomes). 
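
These properties of $\sigma$ and $\sigma/\bar{x}$ are easy to verify on a small example. The following is a minimal sketch, assuming a made-up three-person distribution; the helper names are illustrative and not part of the paper. 

```python
import math

def mean(x):
    return sum(x) / len(x)

def std(x):
    """Standard deviation [(1/n) * sum (x_i - mean)^2]^(1/2)."""
    m = mean(x)
    return math.sqrt(sum((xi - m) ** 2 for xi in x) / len(x))

def cv(x):
    """Coefficient of variation: standard deviation over the mean."""
    return std(x) / mean(x)

incomes = [100.0, 500.0, 900.0]           # hypothetical incomes
shifted = [xi + 50.0 for xi in incomes]   # same amount added to all incomes
scaled = [3.0 * xi for xi in incomes]     # all incomes multiplied by 3
transferred = [101.0, 500.0, 899.0]       # one unit moved from richest to poorest

print(math.isclose(std(shifted), std(incomes)))   # invariance of sigma
print(math.isclose(cv(scaled), cv(incomes)))      # invariance of sigma / mean
print(std(transferred) < std(incomes))            # transfer principle for sigma
print(cv(transferred) < cv(incomes))              # and for sigma / mean
```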

Let us now find the most general form of inequality measure which 
satisfies the required properties. We now call $I(x)$ the measure of per 
person (“absolute”) inequality. Per pound (“relative”) inequality is 
$I/\bar{x}$. We assume $I = 0$ when all incomes are equal. We recall that average 
income $\bar{x}$ is transformed into $\bar{x} + \mu$ when number $\mu$ is added to all $x_i$’s, 
and into $\lambda\bar{x}$ when all $x_i$’s are multiplied by number $\lambda$. $e$ is the $n$-vector 
each coordinate of which is 1. Constancy of inequality per person when 
everyone receives the same amount $\mu$ is 

$$I(x + \mu e) = I(x).$$

Constancy of inequality per pound when all incomes are multiplied by 
the same number $\lambda$, which we assume is positive, implies 

$$I(\lambda x) = \lambda \cdot I(x).$$

These two properties imply 

$$I[\lambda \cdot (x + \mu e)] = \lambda \cdot I(x),$$

or, calling $\nu = \lambda\mu$, 

$$I(\lambda x + \nu e) = \lambda \cdot I(x),$$

which contains each of them as special cases ($\lambda = 1$ and $\nu = 0$), and is 
thus equivalent to the set of these two conditions. 





Now, we can put $\mu = -\bar{x}$ and, when all $x_i$’s are not equal and thus 
$\sigma \neq 0$, $\lambda = 1/\sigma$. This transforms the condition into 

$$I(x) = \sigma \cdot I((x - \bar{x}e)/\sigma),$$

that is, $I/\sigma$ is a function of the “reduced income discrepancies” $(x_i - \bar{x})/\sigma$. 
Conversely, given any function of $n$ variables $F$, 

$$I(x) = \sigma \cdot F[\{(x_i - \bar{x})/\sigma\}]$$

satisfies $I(x + \mu e) = I(x)$ and $I(\lambda x) = \lambda \cdot I(x)$. This form is thus the most 
general one satisfying the two conditions [1, Theorem 10]. 

The other properties required from an inequality measure $I(x)$ impose 
properties of function $F$. Clearly, $I(x)$ is symmetrical, positive when not 
all $x_i$’s are equal, zero when all $x_i$’s are equal if, and only if, respectively, 
$F$ is symmetrical, $F$ is positive when all its arguments are not equal, and 
$\sigma F$ tends to zero when all $x_i$’s tend to be equal (i.e., to $\bar{x}$). If $F$ is linearly 
homogeneous, $I = F[\{x_i - \bar{x}\}]$, and therefore $I$ is zero at equality or 
Schur-convex if and only if $F$ is respectively zero when all its arguments 
are, or Schur-convex (since a transfer does not change $\bar{x}$ and the $x_i - \bar{x}$’s 
are classified as the $x_i$’s). Of course, if $F$ is one, $I$ is $\sigma$; if it is a constant, 
$I$ is proportional to $\sigma$; if it is a standard deviation or the square root of 
an arithmetic average of squares, $F$ is one and $I$ is $\sigma$; if it is any function of 
such a form, it is constant and $I$ is proportional to $\sigma$; we finally notice that 
when there are only two persons, 1 and 2, $\sigma = |x_1 - x_2|/2$, $(x_i - \bar{x})/\sigma = 
\mathrm{sgn}(x_i - x_j)$ and, with symmetrical $F$, $I(x) = k \cdot |x_1 - x_2|$ where $k$ is 
a positive constant. 
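
The general form can be illustrated by picking one particular $F$ and checking the two invariance conditions numerically. In the sketch below the choice of $F$ (the mean absolute reduced discrepancy) and all names are assumptions made for the example; any other symmetrical positive $F$ could be substituted. 

```python
import math

def mean(x):
    return sum(x) / len(x)

def std(x):
    m = mean(x)
    return math.sqrt(sum((xi - m) ** 2 for xi in x) / len(x))

def synthetic_index(x, F):
    """I(x) = sigma * F({(x_i - mean) / sigma}), taken as 0 at equality."""
    m, s = mean(x), std(x)
    if s == 0.0:
        return 0.0
    return s * F([(xi - m) / s for xi in x])

def F_abs(z):
    """Hypothetical symmetrical F: mean absolute reduced discrepancy."""
    return sum(abs(zi) for zi in z) / len(z)

x = [100.0, 500.0, 900.0]
base = synthetic_index(x, F_abs)
shifted = synthetic_index([xi + 50.0 for xi in x], F_abs)
scaled = synthetic_index([3.0 * xi for xi in x], F_abs)

print(math.isclose(shifted, base))         # I(x + mu e) = I(x)
print(math.isclose(scaled, 3.0 * base))    # I(lambda x) = lambda I(x)

# Two persons: sigma = |x1 - x2| / 2, so I(x) = k |x1 - x2| with k = 1/2 here.
two = synthetic_index([3.0, 11.0], F_abs)
print(math.isclose(two, 0.5 * abs(3.0 - 11.0)))
```

With this $F$ the resulting index is simply the mean absolute deviation, which indeed shares both invariance properties with $\sigma$. 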

VI.b. Inconveniences of “Independence" 

All the measures $I(x)$ found violate the “welfare independence” 
property, since, if it were not so, all conditions but $I(x + \mu e) = I(x)$ 
would give $\bar{x}I_r$, and all conditions but $I(\lambda x) = \lambda \cdot I(x)$ would give $I_l$, 
whereas these two functions are inconsistent with each other (when they 
are not identically zero). We may for instance check that, $k$ being any 
positive constant, 

$$\frac{\partial(\bar{x} - k\sigma)/\partial x_i}{\partial(\bar{x} - k\sigma)/\partial x_j} = \frac{\sigma - k(x_i - \bar{x})}{\sigma - k(x_j - \bar{x})}$$

depends upon $x_l$ for $l$ neither $i$ nor $j$ by the intermediary of $\bar{x}$ and $\sigma$. 
There thus exists no functions $\Phi$ and $\varphi$ such that $\bar{x} - k\sigma = \Phi[\sum \varphi(x_i)]$. 
Therefore, if there is “welfare independence,” $\sigma$ (or $k\sigma$) cannot be $I = \bar{x} - \bar{\bar{x}}$ 
where $\bar{\bar{x}}$ is the equal equivalent. But may we then use it instead of $I$, 
along with $\bar{x}$, to classify distributions by an ordinal index $U(\bar{x}, \sigma)$ more 
general than $\bar{x} - k\sigma$ or increasing functions of this expression? There 
would thus exist functions $\Phi$ and $\varphi$ such that the welfare index is 
$U(\bar{x}, \sigma) = \Phi[\sum \varphi(x_i)]$. This can happen if and only if $\varphi$ is a quadratic 
function which can be checked as is done in the theory of choice under 
uncertainty for a similar result (the formal difference is the existence of 
probabilities in the latter case). Now, an equal increase in all $x_i$’s gives 
to $I = \bar{x} - \bar{\bar{x}}$ the marginal variation $I' = 1 - \sum (\partial\bar{\bar{x}}/\partial x_i)$. But $\varphi(\bar{\bar{x}}) = 
(1/n)\sum \varphi(x_i)$ by definition, and thus $\varphi'(\bar{\bar{x}}) \cdot (\partial\bar{\bar{x}}/\partial x_i) = (1/n)\varphi'(x_i)$, from 
which 

$$\varphi'(\bar{\bar{x}}) \cdot \sum (\partial\bar{\bar{x}}/\partial x_i) = (1/n)\sum \varphi'(x_i) = \varphi'(\bar{x})$$

where the last equality holds because $\varphi'$ is linear because $\varphi$ is quadratic. 
$\varphi''$ is constant, and it has to be negative if $\bar{x} > \bar{\bar{x}}$ (i.e., $I > 0$) out of 
equality if we take $\varphi$ increasing, since this is equivalent to strict concavity 
of $\varphi$, i.e., to $\varphi(\bar{x}) > (1/n)\sum \varphi(x_i) = \varphi(\bar{\bar{x}})$. Thus $\varphi'$ must be a decreasing 
function, and $\bar{x} > \bar{\bar{x}}$ out of equality implies $\varphi'(\bar{x}) < \varphi'(\bar{\bar{x}})$, and 
$\sum (\partial\bar{\bar{x}}/\partial x_i) < 1$, and finally $I' > 0$: an equal increase in all incomes 
increases per person inequality.² This is an “ultra-leftist” position, which 
can be objected to. But it also requires the “independence” property. 
Why not rather drop the latter? If we do that, we know that $\sigma$ is from this 
respect a valid companion to $\bar{x}$ to classify distributions, since it is even 
much more: a satisfactory absolute measure of inequality per person. 
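
The “ultra-leftist” consequence of a quadratic $\varphi$ can be seen on a small numerical case. The sketch below assumes the specific quadratic $\varphi(y) = Ay - y^2/2$ with a hypothetical bound $A = 1000$ above all incomes; these choices, and the function names, are illustrative only. 

```python
import math

A = 1000.0  # hypothetical bound: every income must stay below A

def phi(y):
    """Quadratic phi, increasing and strictly concave for y < A."""
    return A * y - 0.5 * y * y

def phi_inverse(v):
    """Inverse of phi on the branch y < A."""
    return A - math.sqrt(A * A - 2.0 * v)

def equal_equivalent(x):
    """Solves phi(result) = (1/n) * sum phi(x_i)."""
    return phi_inverse(sum(phi(xi) for xi in x) / len(x))

def inequality_per_person(x):
    """I = mean income minus equal equivalent income."""
    return sum(x) / len(x) - equal_equivalent(x)

incomes = [100.0, 300.0, 500.0]
raised = [xi + 100.0 for xi in incomes]   # equal increase in all incomes

before = inequality_per_person(incomes)
after = inequality_per_person(raised)
print(before, after)
print(after > before)   # per person inequality increases
```

For this $\varphi$ one finds $I = \sqrt{(A - \bar{x})^2 + \sigma^2} - (A - \bar{x})$, so the index depends on the distribution only through $\bar{x}$ and $\sigma$, in line with the form $U(\bar{x}, \sigma)$ discussed in the text. 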

Standard deviation and coefficient of variation have also been criticized, 
as measures of inequality, because they give the same weight to incomes 
symmetrically distributed around the mean (i.e., x t and x, such that 
x, — x — x — x,) whereas one is larger than the other. But they do not 
give the same importance to variations in such incomes since we have 
noticed that a small transfer from a richer person to a poorer one decreases 
o and ajx. This decrease even appeared to be proportional to the difference 
of these two incomes; but this opens the way to another possible criticism 
of these measures, which is the topic of the next section. 

2 But it does not necessarily increase the per pound (“relative”) inequality $I/\bar{x}$. This 
increases or decreases according as the relative variation in $\bar{\bar{x}}$ is smaller or larger than 
that of $\bar{x}$, i.e., as $\bar{x} \cdot \varphi'(\bar{x}) \lessgtr \bar{\bar{x}} \cdot \varphi'(\bar{\bar{x}})$. Since $\varphi'$ is linear, decreasing, and positive, this 
expression is of the form $(a - \bar{x})\bar{x} \lessgtr (a - \bar{\bar{x}})\bar{\bar{x}}$ with $a > 0$ and $x_i < a$ for all $i$’s (which 
implies $0 < \bar{\bar{x}} < \bar{x} < a$). But if we choose all $x_i$’s between 0 and $a/2$, we also have 
$0 < \bar{\bar{x}} < \bar{x} < a/2$ and the sign $>$ holds in the inequality, whereas if we choose all 
$x_i$’s between $a/2$ and $a$, we also have $a/2 < \bar{\bar{x}} < \bar{x} < a$ and the sign $<$ holds in the 
inequality. This notwithstanding the fact that both [1]’s “marginal injustice” and 
“relative marginal injustice” ($-\varphi''/\varphi' = 1/(a - y)$ and $-y\varphi''/\varphi' = y/(a - y)$) are 
increasing with $y$. A. Atkinson [2] calls them “inequality-aversion” and “relative 
inequality-aversion” by analogy with risk theory vocabulary in English (the 
“risk-aversion” measure was called “prudence” in French) and suggests that “while the 
objections to this property are less strong than the corresponding objections in the 
uncertainty case, it may be grounds for rejecting the quadratic.” 





VII. The Principle of Diminishing Transfers 

A mild equalitarian will certainly appreciate a small transfer from a 
richer person to a poorer one (“transfers principle,” Schur-concavity of 
the evaluation function, “rectifiance," “isophily”). But he may go one 
step further and value more such a transfer between persons with given 
income difference if these incomes are lower than if they are higher. 
Thus, he would prefer to transfer one pound from a person who earns 
500 pounds a month to another one who earns only 100, than to transfer 
one pound from a 900 pounds earner to a person who already earns 500 
pounds. None of these operations changes total social nor average income. 
Thus, their effect on an evaluation function, which shows on that 
specification of it, the equal equivalent $\bar{\bar{x}}$, appears with reverse ordering on 
inequality measures $\bar{x} - \bar{\bar{x}}$ and $(\bar{x} - \bar{\bar{x}})/\bar{x}$. Extending H. Dalton’s 
vocabulary, we may call this property the “principle of diminishing 
transfers.” As the “transfers principle,” this concept is an ordinal one 
since it is defined by a classification of differences between derivatives of 
the index for each given distribution (if $J(x)$ is an index with $J_i = \partial J/\partial x_i$, 
the inequality $J_i - J_j > J_k - J_l$ does not change when $J$ is transformed 
into $F(J)$ where $F$ is any increasing function).³ 

We have noticed that the effect of a marginal transfer from person $i$ to 
person $j$ on standard deviation $\sigma$ is proportional to $x_i - x_j$. For a given 
discrepancy between these two incomes, it does not depend upon their 
level. As a measure of inequality, $\sigma$ thus violates the above “principle.” 
And so does any function of $\sigma$ and $\bar{x}$, such as $k\sigma$, the coefficient of variation 
$\sigma/\bar{x}$, the variance $\sigma^2$, $\sigma^2/\bar{x}^2$, $\sigma^2/\bar{x}$ and any of these multiplied by $k$. 

A. Atkinson [2] pointed out this property of the coefficient of variation, 
and suggested that it could be a shortcoming of this measure. But how 
do the other measures mentioned in the preceding sections fare in this 
respect? 

For “welfare independent” and “impartial” measures, of the form 
$\bar{x} - \bar{\bar{x}}$ or $(\bar{x} - \bar{\bar{x}})/\bar{x}$ with $\varphi(\bar{\bar{x}}) = (1/n)\sum \varphi(x_i)$ (cf. Section XI.d below), 
we consider the equivalent property on the equal equivalent income $\bar{\bar{x}}$. 
For $x_i$, $x_j$, $x_k$, $x_l$ such that $x_i < x_j$, $x_k < x_l$, $x_j - x_i = x_l - x_k$, 
$x_i < x_k$, $x_j < x_l$, we want to know whether $(\partial\bar{\bar{x}}/\partial x_i) - (\partial\bar{\bar{x}}/\partial x_j) \gtrless 
(\partial\bar{\bar{x}}/\partial x_k) - (\partial\bar{\bar{x}}/\partial x_l)$. From the definition $\varphi(\bar{\bar{x}}) = (1/n)\sum \varphi(x_i)$ and thus 
$\varphi'(\bar{\bar{x}}) \cdot (\partial\bar{\bar{x}}/\partial x_i) = (1/n)\varphi'(x_i)$, this is equivalent to $\varphi'(x_i) - \varphi'(x_j) \gtrless 
\varphi'(x_k) - \varphi'(x_l)$ if $\varphi'(\bar{\bar{x}})$ is positive and the reverse inequalities if it is 
negative. This is in turn a way of writing that $\varphi'$ is convex or concave, 
another way being, if $\varphi'''$ exists, $\varphi''' \gtrless 0$ (almost everywhere for the 
strict inequalities). And $(\partial\bar{\bar{x}}/\partial x_i) > 0$ for all possible income levels 
(or $(\partial\bar{\bar{x}}/\partial x_i) \neq 0$) imposes that $\varphi'$ has the same sign everywhere. Therefore, 
the “principle of diminishing transfers,” or its opposite, is true, according 
as $\varphi'$ and $\varphi'''$ have the same or opposite signs. And since $\varphi'$ and $\varphi''$ have 
opposite signs (seen to be necessary for $\bar{x} > \bar{\bar{x}}$ out of equality), the 
condition is that $\varphi'$, $\varphi''$ and $\varphi'''$ alternate in signs or not. 

3 A still more egalitarian concept would be a “principle of relatively diminishing 
transfers” saying that a small transfer is more equalizing from $j$ to $i$ than from $l$ to $k$ 
(i.e., it decreases inequality more, or, since $X$ remains unchanged, it is preferred) when 
$x_j/x_i = x_l/x_k$ and $x_i < x_j$, $x_k < x_l$, $x_i < x_k$, $x_j < x_l$. It also is an ordinal concept. 

Now, for $\varphi(y) = (y + \xi)^{1-\epsilon}$, including the special case where $\xi = 0$, 
$\varphi'$, $\varphi''$, and $\varphi'''$ have the respective signs of $1 - \epsilon$, $(1 - \epsilon)(-\epsilon)$, 
$(1 - \epsilon)(-\epsilon)(-\epsilon - 1)$. They alternate if and only if $\epsilon > 0$, which is 
required by the condition that $\varphi'$ and $\varphi''$ alone differ in sign. These 
derivatives also alternate in sign for $\varphi(y) = \mathrm{Log}(y + \xi)$, including the special 
case $\xi = 0$. And for $\varphi(y) = e^{-\alpha y}$, the derivatives alternate in sign if and 
only if $\alpha > 0$, i.e., if $\varphi'$ and $\varphi''$ alone differ in sign. Therefore, all inequality 
measures $I_r$, $I_l$, $I_c^a$, $I_r^a$, $I_l^r$, $I_c^r$ satisfy the principle of diminishing 
transfers. 
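
The contrast between the variance and a measure built on $\varphi = \mathrm{Log}$ can be reproduced with the 100, 500 and 900 pound earners of the example above. This is only a sketch: the three-person population and the function names are assumptions, and the equal equivalent for $\varphi = \mathrm{Log}$ with $\xi = 0$ is the geometric mean. 

```python
import math

def equal_equivalent_log(x):
    """Equal equivalent income for phi = Log (xi = 0): the geometric mean."""
    return math.exp(sum(math.log(xi) for xi in x) / len(x))

def variance(x):
    m = sum(x) / len(x)
    return sum((xi - m) ** 2 for xi in x) / len(x)

base = [100.0, 500.0, 900.0]
low_transfer = [101.0, 499.0, 900.0]    # one pound from the 500 earner to the 100 earner
high_transfer = [100.0, 501.0, 899.0]   # one pound from the 900 earner to the 500 earner

# Gain in equal equivalent income: larger when the transfer occurs at lower incomes.
gain_low = equal_equivalent_log(low_transfer) - equal_equivalent_log(base)
gain_high = equal_equivalent_log(high_transfer) - equal_equivalent_log(base)
print(gain_low > gain_high > 0.0)

# Fall in variance: identical for the two transfers, whatever the income level.
drop_low = variance(base) - variance(low_transfer)
drop_high = variance(base) - variance(high_transfer)
print(math.isclose(drop_low, drop_high))
```

Since the mean is the same in all three distributions, the ranking of the gains in equal equivalent income is also the ranking of the reductions in $\bar{x} - \bar{\bar{x}}$. 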

The effect of a transfer on the measures derived from $\sigma$ still poses 
another problem. We have $\partial\sigma/\partial x_i = (x_i - \bar{x})/n\sigma$ and 

$$\frac{\partial(\sigma/\bar{x})}{\partial x_i} = \frac{1}{n\bar{x}}\left(\frac{x_i - \bar{x}}{\sigma} - \frac{\sigma}{\bar{x}}\right),$$

which shows the Schur-convexity, and $(\partial\sigma/\partial x_i) - (\partial\sigma/\partial x_j) = (x_i - x_j)/n\sigma$ 
and 

$$\frac{\partial(\sigma/\bar{x})}{\partial x_i} - \frac{\partial(\sigma/\bar{x})}{\partial x_j} = \frac{x_i - x_j}{n\sigma\bar{x}}$$

which shows the proportionality to the difference of incomes. We then 
see that a small transfer from $j$ to $i$ ($x_i < x_j$) decreases $\sigma$ and $\sigma/\bar{x}$ more 
when $\sigma$ is smaller, for given $x_i$ and $x_j$ (and $\bar{x}$ for $\sigma/\bar{x}$); i.e., the transfer 
decreases inequality more when inequality is smaller. But the effects of 
this transfer on variance $\sigma^2$ (unchanged by an equal variation in all 
incomes), or $\sigma^2/\bar{x}^2$ (unchanged by an equiproportional variation in all 
incomes), and on $\sigma^2/\bar{x}$ (proportional to an equiproportional variation in 
all incomes) are respectively $(2/n)(x_i - x_j)$, $(2/n\bar{x}^2)(x_i - x_j)$ and 
$(2/n\bar{x})(x_i - x_j)$: none depends upon the inequality measure. However, 
the last two and the effect on $\sigma/\bar{x}$ are smaller when $\bar{x}$ is larger; we may 
find this objectionable, because then the fixed $x_i$ and $x_j$ become in some 
sense smaller relative to the rest of the distribution. 
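
The two difference formulas can be compared with finite-difference derivatives. The sketch below uses a hypothetical four-person distribution and a central-difference helper introduced only for this check. 

```python
import math

def mean(x):
    return sum(x) / len(x)

def std(x):
    m = mean(x)
    return math.sqrt(sum((xi - m) ** 2 for xi in x) / len(x))

def cv(x):
    return std(x) / mean(x)

def central_difference(f, x, i, h=1e-4):
    """Numerical partial derivative of f with respect to x[i]."""
    up = list(x)
    up[i] += h
    down = list(x)
    down[i] -= h
    return (f(up) - f(down)) / (2.0 * h)

x = [100.0, 500.0, 900.0, 1300.0]
n, m, s = len(x), mean(x), std(x)
i, j = 0, 2

d_sigma = central_difference(std, x, i) - central_difference(std, x, j)
d_cv = central_difference(cv, x, i) - central_difference(cv, x, j)

# Both differences are proportional to x_i - x_j, as stated in the text.
print(math.isclose(d_sigma, (x[i] - x[j]) / (n * s), rel_tol=1e-6))
print(math.isclose(d_cv, (x[i] - x[j]) / (n * s * m), rel_tol=1e-6))
```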





VIII. Inequality Measures and Lorenz and Concentration Curves 

We mentioned several times the equivalence between the “transfers 
principle” and smaller inequality for distributions of the same total 
whose Lorenz curve is everywhere above. The transfers principle itself 
only compares distributions of the same total amount. What can be said 
for distributions which do not have the same total and average amount?⁴ 

We shall have to consider the “concentration curve” of a distribution. 
By this name we call the graph of the sum of the $m$ smallest incomes as a 
function of $m$. More precisely, $x = \{x_i\}$ ($i = 1, \ldots, n$) being an income 
distribution, we reorder the $x_i$’s into the $x_i'$’s with $x_1' \leqq x_2' \leqq \cdots \leqq x_n'$ 
and each $x_j'$ is an $x_i$.⁵ We then call $y_j = \sum_{i=1}^{j} x_i'$. The concentration curve 
is $y_j$ as a function of $j$. We obviously have $y_n = \sum x_i = X$, and 
$y_1 = \mathrm{Min}_i\, x_i$. 

We also define $\eta_j = y_j/y_n = y_j/X$. Of course, $\eta_n = 1$. The Lorenz 
curve is obtained by plotting $\eta_j$ against the figure $j/n$. We call $x' = \{x_i'\}$, 
$y = \{y_i\}$, and $\eta = \{\eta_i\} = y/X$ the $n$-vectors of the $x_i'$’s, $y_i$’s and $\eta_i$’s. 
By the relation $\geq$ between two vectors of same dimension, we mean $\geqq$ 
for each coordinate and $>$ for at least one. Superscript 1 and 2 will refer 
to two distributions which we compare. $y^1 \geq y^2$ means that $x^1$’s concentration 
curve is “nowhere under and somewhere above” $x^2$’s. $\eta^1 \geq \eta^2$ 
means that $x^1$’s Lorenz curve is “nowhere under and somewhere 
above” $x^2$’s.⁶ 

The following relations hold. Of course, $x^1 \geq x^2$ or $y^1 \geq y^2$ implies 
$X^1 \geqq X^2$. And if $X^1 = X^2$, $y^1 \geq y^2$ and $\eta^1 \geq \eta^2$ imply each other. Also, 
$\eta^1 \geq \eta^2$ and $X^1 \geqq X^2$ imply $y^1 \geq y^2$ (but $y^1 \geq y^2$ does not imply $\eta^1 \geq \eta^2$). 
$y^1 = y^2$, $x'^1 = x'^2$, and $x^1$ and $x^2$ are a permutation of each other (i.e., 
their coordinates are), are equivalent properties. $x'^1 \geq x'^2$ implies $y^1 \geq y^2$, 
and $x^1 \geq x^2$ implies $x'^1 \geq x'^2$ (see, for instance, [3, pp. 108-109]) and 
$y^1 \geq y^2$. 
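
The definitions of the concentration and Lorenz curves, and the remark that dominance of concentration curves does not imply dominance of Lorenz curves, can be illustrated as follows. The two distributions and the function names are assumptions made for the example. 

```python
import math
from itertools import accumulate

def concentration_curve(x):
    """y_j = sum of the j smallest incomes, for j = 1, ..., n."""
    return list(accumulate(sorted(x)))

def lorenz_curve(x):
    """eta_j = y_j / X, to be plotted against j / n."""
    y = concentration_curve(x)
    return [yj / y[-1] for yj in y]

def dominates(u, v):
    """True when u is nowhere under and somewhere above v."""
    pairs = list(zip(u, v))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)

x1 = [10.0, 1.0, 1.0]   # larger total, very unequal
x2 = [1.0, 1.0, 1.0]    # smaller total, perfectly equal

y1, y2 = concentration_curve(x1), concentration_curve(x2)
print(y1, y2)
print(dominates(y1, y2))                               # concentration curve of x1 dominates
print(dominates(lorenz_curve(x1), lorenz_curve(x2)))   # but its Lorenz curve does not

# eta(x) = y(x / X): the Lorenz curve is the concentration curve of income shares.
total = sum(x1)
shares_curve = concentration_curve([xi / total for xi in x1])
print(all(math.isclose(a, b) for a, b in zip(lorenz_curve(x1), shares_curve)))
```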

Writing $y(x)$ and $\eta(x)$ for the vector functions by which $y$ and $\eta$ are 
derived from $x$, one has $\eta(x) = y(x)/X = y(x/X)$ since each $y_j$ is a linear 
homogeneous function of the $x_i$’s. This shows that for intensive inequality 
measures (i.e., $I(\lambda x) = I(x)$ for all admissible $\lambda$’s and $x$’s), relations with 
Lorenz or concentration curves are equivalent. Among these measures 
are $I_r$ and $\sigma/\bar{x}$, and $I_r$ is the only “welfare independent” one which has this 
4 Cf. [1, Theorems 1-6]. 

5 $i$ as a function of $x_i'$ thus is the number of persons whose income is not larger than 
$x_i'$. $i/n$ as function of $x_i'$ is therefore the cumulative distribution function of the $x_i$’s. 

6 In [1, Sect. VI], a preference for a higher concentration curve is called “isophily,” 
and a preference for a higher Lorenz curve for distributions of the same amount is 
called “constant-sum isophily” (“isophile” = who likes equality). 





property. For other inequality measures, the relation between their 
relations with Lorenz and concentration curves depends upon the effect 
on them of a multiplication of $x$ by a scalar (even though $X$ is a special 
one). Results of Part I, Section V, show that all the other measures which 
have been considered vary in the same direction as such an equiproportional 
variation in all $x_i$’s, with the exceptions there mentioned: we shall 
say that inequality is subintensive when $I(\lambda x) \gtrless I(x)$ depending on 
$\lambda \gtrless 1$ for all admissible nonequal $x_i$’s and $\lambda$’s (“sub” is here because this 
property may be considered as more moderate than intensiveness). 

We call again $V(x)$ an ordinal, differentiable, increasing (“benevolence”), 
strictly Schur-concave (“impartiality”-symmetry and “rectifiance”-“transfers 
principle”) evaluation function, $\bar{\bar{x}}$ defined by $V(x) = V(\bar{\bar{x}}e)$ the equal 
equivalent income, and $I^a = \bar{x} - \bar{\bar{x}}$ and $I^r = 1 - (\bar{\bar{x}}/\bar{x})$ the inequalities 
per person and per pound. These inequality measures are zero when all 
$x_i$’s are equal ($x = \bar{x}e = \bar{\bar{x}}e$). Obviously, Schur-concavity of $V$ and $\bar{\bar{x}}$ 
and Schur-convexity of $I^a$ and $I^r$ are all equivalent properties. They are 
both symmetry of these four functions of $x$ and the “rectifiance” conditions: 
$x_i < x_j$ implies $\partial V/\partial x_i > \partial V/\partial x_j$, $\partial\bar{\bar{x}}/\partial x_i > \partial\bar{\bar{x}}/\partial x_j$, $\partial I/\partial x_i < \partial I/\partial x_j$ 
for $I$ being $I^a$ or $I^r$ (we now consider only the “strict” forms). If $V$ is 
“independent,” i.e., if $V = \Phi[\sum \varphi(x_i)]$, these latter conditions are equivalent 
to strict concavity of $\varphi$ if it is increasing ($\Phi$ increasing), to its strict 
convexity if it is decreasing ($\Phi$ decreasing). We recall that all the specific 
inequality measures previously considered are Schur-convex (we exclude 
the trivial identically zero cases). 

For constant-sum comparisons, i.e., comparisons between $x^1$ and $x^2$ 
such that $X^1 = X^2$, $y^1 \geq y^2$ and $\eta^1 \geq \eta^2$ are equivalent, and since $\bar{x}^1 = \bar{x}^2$, 
$I^a$ and $I^r$ vary in the same direction ($I$ will be $I^a$ or $I^r$) and $V$ in the opposite 
one. Then, $\eta^1 \geq \eta^2$ and the transfers principle are equivalent in the 
following sense. $I$ is Schur-convex if and only if $I(x^1) < I(x^2)$ for all $x^1$ 
and $x^2$ such that $\eta^1 \geq \eta^2$; and $\eta^1 \geq \eta^2$ if and only if $I(x^1) < I(x^2)$ for all 
Schur-convex $I$.⁷ 
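
A constant-sum comparison of this kind can be built by averaging a distribution with a bistochastic matrix. The sketch below is only illustrative: the matrix $B$, the distribution, and the helper names are assumptions, and $\sigma$ and $\sum x_i^2$ stand in for Schur-convex measures. 

```python
import math
from itertools import accumulate

def std(x):
    m = sum(x) / len(x)
    return math.sqrt(sum((xi - m) ** 2 for xi in x) / len(x))

def lorenz_curve(x):
    y = list(accumulate(sorted(x)))
    return [yj / y[-1] for yj in y]

def dominates(u, v):
    """True when u is nowhere under and somewhere above v."""
    pairs = list(zip(u, v))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)

def apply_matrix(B, x):
    return [sum(bij * xj for bij, xj in zip(row, x)) for row in B]

# Hypothetical bistochastic matrix: nonnegative, rows and columns each sum to 1.
B = [[0.75, 0.0, 0.25],
     [0.0, 1.0, 0.0],
     [0.25, 0.0, 0.75]]

x2 = [1.0, 5.0, 9.0]
x1 = apply_matrix(B, x2)   # averaged distribution with the same total

print(x1, sum(x1) == sum(x2))
print(dominates(lorenz_curve(x1), lorenz_curve(x2)))   # Lorenz curve of x1 is higher
print(std(x1) < std(x2))                               # sigma is smaller
print(sum(v * v for v in x1) < sum(v * v for v in x2)) # so is the sum of a convex W
```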

We consider now the more general case where $X^1 \geqq X^2$ and $\bar{x}^1 \geqq \bar{x}^2$. 
This inequality is implied by $y^1 \geq y^2$. We now have the more general 

7 This relation contains three propositions: (1°) $I(x^1) < I(x^2)$ if $I$ is Schur-convex 
and $\eta^1 \geq \eta^2$; (2°) $I$ is Schur-convex if $I(x^1) < I(x^2)$ for all $x^1$ and $x^2$ such that $\eta^1 \geq \eta^2$; 
(3°) $\eta^1 \geq \eta^2$ if $I(x^1) < I(x^2)$ for all Schur-convex $I$. The two first ones result from 
Ostrowski’s characterization of strict Schur-convexity by $I(Bx) < I(x)$ for all 
bistochastic matrix $B$ when $Bx$ is not a permutation of $x$ [4], and from the equivalence 
between $\eta^1 \geq \eta^2$ and altogether there exists a bistochastic matrix $B$ such that $x^1 = Bx^2$ 
(a direct result from Hardy-Littlewood-Pólya’s Theorem 46 [5]) and $x^1$ is not a 
permutation of $x^2$. The third one results from an easy to prove strict form of Karamata’s 
inequality [6] by considering the Schur-convex functions $I$ of the form $\sum W(x_i)$ with 
convex $W$. 



result that $y^1 > y^2$ and the transfers principle are equivalent in the
following sense. $I$ is Schur-convex if and only if $V(x^1) > V(x^2)$ for all $x^1$
and $x^2$ such that $y^1 > y^2$; and $y^1 > y^2$ if and only if $V(x^1) > V(x^2)$ for all
Schur-convex $I$ (and thus Schur-concave $V$).

Let us prove these latter results. If $V(x^1) > V(x^2)$ for all $x^1$ and $x^2$ such
that $y^1 > y^2$, it is in particular so for all $x^1$ and $x^2$ for which, in addition,
$X^1 = X^2$, and $I$ is thus Schur-convex ($V$ Schur-concave) from a previous
result. The property $V(x^1) > V(x^2)$ if $V$ is Schur-concave and $y^1 > y^2$
is Ostrowski’s Theorem V of [4]. If $V(x^1) > V(x^2)$ for all (strictly) Schur-concave
$V$, by continuity $V(x^1) \ge V(x^2)$ for all weakly Schur-concave $V$
(i.e., $V$ such that $x_i < x_j$ implies $V_i \ge V_j$); but $y_j$ is a weakly Schur-concave
function of $x$; thus, $y_j^1 \ge y_j^2$ for all $j$; and $y_j^1 = y_j^2$ for all $j$
would imply $x'^1 = x'^2$ and thus $V(x^1) = V(x^2)$ for a (strictly) Schur-concave
$V$; therefore $y^1 > y^2$.

These are results about concentration curves’ “dominance.” For Lorenz
curves and non-constant-sum comparisons, the following results hold.

(1°) If $\eta^1 \ge \eta^2$ and $X^1 > X^2$, $V(x^1) > V(x^2)$ for all Schur-concave $V$'s.

(2°) For intensive Schur-convex inequality measures, $I(x^1) < I(x^2)$ if
$\eta^1 > \eta^2$.

(3°) $\eta^1 < \eta^2$ and $X^1 > X^2$ imply $I(x^1) > I(x^2)$ for subintensive
Schur-convex inequality measures.

We note that, among the specific inequality measures considered above,
$I_r$ and $\sigma/\bar{x}$ are intensive, and all others are subintensive (except $I_c$ for
$\xi < 0$; and $I_r$ is the only “welfare independent” intensive measure).

Property (1°) holds because $\eta^1 \ge \eta^2$ and $X^1 \ge X^2$ imply $y^1 \ge y^2$.
Property (2°) holds because, for inequality measures such that $I(\lambda x) = I(x)$,
and noting that $\eta$ is the $y$ of $x/X$,

$$I(x^1) = I(x^1/X^1) < I(x^2/X^2) = I(x^2).$$

Property (3°) holds because, for inequality measures such that $I(\lambda x) \gtrless I(x)$
out of equality depending on $\lambda \gtrless 1$, and since $\eta^1 < \eta^2$ is identical to
$y^1 X^2/X^1 < y^2$ whereas $y^1 X^2/X^1$ is the $y$ of $x^1 X^2/X^1$,

$$I(x^2) < I(x^1 X^2/X^1) \le I(x^1).$$
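
Properties (2°) and (3°) can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the two three-person distributions are invented for the purpose, the coefficient of variation stands in for an intensive Schur-convex measure and the standard deviation for a subintensive one, and the function names (`lorenz`, `std`, `cv`) are hypothetical helpers, not notation from the text.

```python
import math

def lorenz(x):
    """Lorenz ordinates: share of total income held by the j poorest persons."""
    xs = sorted(x)
    total = sum(xs)
    shares, running = [], 0.0
    for v in xs:
        running += v
        shares.append(running / total)
    return shares

def std(x):
    """Standard deviation (per person, subintensive in the text's terms)."""
    m = sum(x) / len(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def cv(x):
    """Coefficient of variation (per pound, intensive: unchanged by scaling)."""
    return std(x) / (sum(x) / len(x))

# Illustrative data (assumed): x1 has the higher Lorenz curve, x2 the larger total.
x1 = [3.0, 3.0, 4.0]   # total 10
x2 = [1.0, 2.0, 9.0]   # total 12

eta1, eta2 = lorenz(x1), lorenz(x2)
x1_dominates = all(a >= b for a, b in zip(eta1, eta2)) and eta1 != eta2

print("Lorenz curve of x1 dominates:", x1_dominates)
print("(2) intensive measure lower for x1:", cv(x1) < cv(x2))
print("(3) subintensive measure higher for x2:", std(x2) > std(x1))
```

For property (3°) the roles are read with $x^2$ here as the distribution that has both the lower Lorenz curve and the larger total.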

IX. Additions of Incomes and Diminishing Equality 
IX.a. General Properties 

When we add incomes of several kinds, how does inequality in the 
global income depend upon the inequalities of the various components? 
This certainly is a useful question. The “incomes” added could for instance
be: earned and unearned incomes, or yearly increments in incomes
(which shows what inequality variation is induced by the growth of 
incomes), or private incomes and government transfers (plus, possibly, 
the value to persons of free government services), or after tax incomes and 
taxes (the sum of which gives before tax incomes, so that we relate taxes’ 
effect on income inequality to fiscal inequality), etc. Or we may want to 
consider inequalities in wealth and in various kinds of wealth (nonhuman 
and human, etc.), or to relate the variations of inequality in wealth 
holding with the inequality in net savings (wealth increments), etc. Note 
that our “addition” would be the statisticians’ “composition” of income 
distributions. We shall obtain the property that, for some per person or 
total measures, inequality in the sum is less than the sum of inequalities; 
we shall call this the subadditivity property. And, for these measures, 
growth of inequality is less than the inequality in growth. For the corre¬ 
sponding per pound measures, these relations hold when the inequalities 
are weighted by average or total incomes (i.e., a weight is one of these 
incomes divided by their sum); we shall call this the weighted subadditivity 
property. 

When we consider weighted sums of incomes, rather than unweighted 
ones, we are introduced to the related properties of “convexity." This 
answers questions such as the following. Suppose we progressively, 
regularly, and proportionally bridge the gap from some income distri¬ 
bution to a more equally distributed one. Will we meet some sort of 
diminishing returns so that the decrease in inequality index is slower and 
slower? In mathematical terms, this property would be the convexity 
of the inequality index $I(x)$ in the set of all $x_i$'s. Since $I = 0$ when all $x_i$'s
are equal, it implies that if we bridge some proportion of the gap from 
some unequal distribution to any equality, inequality is reduced by more 
than this proportion. 

Convexity implies quasi-convexity, i.e., a distribution in which each 
income is the same average of what it is in several other ones is not more 
unequal than the most unequal of the latter. This is equivalent to saying 
that if the latter have the same degree of inequality, the average’s is not 
higher. By average we mean a weighted linear one, but in this definition we 
can restrict ourselves to such an average with given weights—for instance 
to arithmetic averages—and also to the consideration of only pairs of 
averaged distributions. Quasi-convexity is “strict” when “not more 
unequal” can be replaced by “less unequal.” 

Quasi-convexity, in turn, plus symmetry, imply Schur-convexity which
implies the “transfers principle” (rectifiance) and has interesting equivalent
properties such as the mentioned one between inequality measure and 
Lorenz, or concentration, curves (“isophily”). 



All these convexity, quasi-convexity and Schur-convexity properties 
could be either valid for any distributions, or restricted to relations between 
distributions of the same total amount and which hence have the same 
average income (but they would then hold for all levels of these total or 
average incomes); in this latter case, we will add the adjective “constant- 
sum” to the property. 

IX.b. Results 

The results will roughly be the following. $I_r^a$ and $\sigma$ are subadditive.
$I_r$ and $\sigma/\bar{x}$ present weighted subadditivity. $I_l$, $I_c$, $I_r^a$, $\sigma$ are convex. $I_r$,
$I_l^r$, $I_c^r$, $\sigma/\bar{x}$ are constant-sum convex.

Let us be more precise.

For $I_r^a$ and $\sigma$: the inequality of a sum is not larger than the sum of the
inequalities; the inequality of a sum of nonproportional distributions is smaller
than the sum of their inequalities. For $I_r$ and $\sigma/\bar{x}$: the inequality of a sum is
not larger than the sum of the inequalities weighted by average or total incomes;
the inequality of a sum of nonproportional distributions is smaller than the sum
of their inequalities weighted by their average or total incomes.

$I_r^a$ and $\sigma$ are strictly convex for distributions which are not all proportional
to each other. $I_l$ is strictly convex for distributions which do not all differ
from each other by the same difference in all incomes.

The properties of the exceptions to all these cases are already known.
Sums and weighted sums of proportional distributions are also proportional
to them. They thus all have the same $I_r$ and $\sigma/\bar{x}$. And their $I_r^a$ and $\sigma$
are as their proportions are: the sum's is the sum and the weighted sum’s
is the weighted sum. As for $I_l$, sums and weighted sums of distributions
which differ from each other by an equal difference in all incomes also
belong to this class, and they all have the same $I_l$.

Finally, we shall also find for $I_c$ and $I_c^r$ properties which extend the
subadditivities of $I_r^a$ and $I_r$ and which we shall call “pseudo subadditivity”
and its weighted form. Also, the cases where $I_c$ is not strictly convex will
appear.

IX.c. Demonstrations 

IX.c.1. Rightist and centrist measures [1, Theorem 22]. Let $x^k$ be
several distributions with index $k$, $\bar{x}^k$ their average incomes, $X^k = n\bar{x}^k$
their total incomes.

For $\bar{\bar{x}}(x) = \bar{x} - I_r^a(x) = [(1/n) \sum x_i^{1-\varepsilon}]^{1/(1-\varepsilon)}$ or $(\prod x_i)^{1/n}$ with $\varepsilon > 0$,
Minkowski’s inequality gives $\bar{\bar{x}}(\sum x^k) \ge \sum \bar{\bar{x}}(x^k)$ and thus

$$I_r^a\Big(\sum_k x^k\Big) \le \sum_k I_r^a(x^k)$$

with equality if and only if all $x^k$’s are proportional: inequality measure
$I_r^a$ is subadditive.

Since $I_r^a = \bar{x} I_r$, this gives

$$I_r\Big(\sum_k x^k\Big) \le \sum_k \Big(\bar{x}^k \Big/ \sum_l \bar{x}^l\Big) I_r(x^k) = \sum_k \Big(X^k \Big/ \sum_l X^l\Big) I_r(x^k)$$

with equality if and only if all the $x^k$'s are proportional (they then all
have the same $I_r$ which is also their sum’s). This is the property of weighted
subadditivity of $I_r$.

Subadditivity and linear homogeneity of $I_r^a$ show that

$$I_r^a\Big(\frac{x^1 + x^2}{2}\Big) = \frac{1}{2} I_r^a(x^1 + x^2) \le \frac{I_r^a(x^1) + I_r^a(x^2)}{2},$$

i.e., $I_r^a$ is a convex function of $x$, with strict convexity for nonproportional
$x$’s.

$I_r = I_r^a/\bar{x}$ is thus convex at given $\bar{x}$ and $X = n\bar{x}$, whatever these levels
of $\bar{x}$ and $X$. We may call this property constant-sum convexity of $I_r$.

The change in variables which transforms $I_r^a$ into $I_c$ does not affect
the convexity properties. $I_c$ is thus convex in $x$. And $I_c^r = I_c/\bar{x}$ is therefore
convex at given $\bar{x}$ or $X = n\bar{x}$: it also has the constant-sum convexity
property. Furthermore, $I_r^a$’s subadditivity and this change of variables
show that $I_c$ has the following pseudo subadditivity property where $m$ is
the number of $k$'s


( VI \ Vi 

X **+ < X 4(** + 0, 

fc=l / A=1 

and that $I_c^r$ thus has the weighted pseudo subadditivity property

h T (X ** + mf) < X (**/l **) h r (x k + 0 X (Jf*/X xl ) Vi** + £). 

IX.c.2. Standard deviation and coefficient of variation. Call $\rho(x) =
(\sum x_i^2)^{1/2}$. From Minkowski’s inequality, $\rho(\sum x^k) \le \sum \rho(x^k)$ with equality
if and only if all $x^k$'s are proportional. That is, $\rho(x)$ is subadditive. As
shown above, this implies it is convex, strictly for nonproportional $x$'s.
Changing variables from $x_i$ into $x_i - \bar{x}$ and $\rho$ into $\rho/\sqrt{n}$ does not change
convexity for given $\bar{x}$: $\sigma = [(1/n) \sum (x_i - \bar{x})^2]^{1/2}$ is thus convex for given $\bar{x}$,
i.e., for $x^k$'s with same average $\bar{x}^k$. And since such $x^k$'s cannot be proportional
without being equal, $\sigma$ is strictly convex for given $\bar{x}$. But it is also
linearly homogeneous. Therefore, the hypersurface graph of $\sigma(x)$ is a cone
whose summit is the origin and having a strictly convex base in the hyperplane
$\sum x_i = n\bar{x}$ for this given $\bar{x}$: it is in this sense a “strictly” convex
cone. Therefore, the results are that $\sigma$ is both a convex and a subadditive
function of $x$, strictly for nonproportional $x$'s. In particular,

$$\sigma\Big(\sum_k x^k\Big) \le \sum_k \sigma(x^k)$$

with equality if and only if all $x^k$'s are proportional.

Consequently, the coefficient of variation $\sigma/\bar{x}$ has the properties of
constant-sum convexity and of weighted subadditivity with weights proportional
to average or total incomes and equality if and only if all $x^k$’s
are proportional (they then all have the same $\sigma/\bar{x}$ which is also their sum’s).
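
The subadditivity and weighted subadditivity relations of this subsection lend themselves to a quick numerical check. The following is a minimal sketch, assuming randomly drawn positive income vectors and three illustrative values of $\varepsilon$; the helper names (`ede`, `i_abs`, `i_rel`, `sigma`) are hypothetical and simply transcribe $\bar{\bar{x}}$, $I_r^a$, $I_r$ and $\sigma$ as defined above.

```python
import math
import random

def mean(x):
    return sum(x) / len(x)

def ede(x, eps):
    """Equal equivalent income: power mean of order 1 - eps (geometric mean at eps = 1)."""
    n = len(x)
    if eps == 1.0:
        return math.exp(sum(math.log(v) for v in x) / n)
    p = 1.0 - eps
    return (sum(v ** p for v in x) / n) ** (1.0 / p)

def i_abs(x, eps):
    """Per person inequality: average income minus equal equivalent income."""
    return mean(x) - ede(x, eps)

def i_rel(x, eps):
    """Per pound inequality."""
    return i_abs(x, eps) / mean(x)

def sigma(x):
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

random.seed(0)
tol = 1e-9
holds = True
for _ in range(1000):
    a = [random.uniform(1, 100) for _ in range(5)]
    b = [random.uniform(1, 100) for _ in range(5)]
    s = [u + v for u, v in zip(a, b)]
    w = mean(a) / mean(s)            # weight of a by average (or total) income
    for eps in (0.5, 1.0, 2.0):
        holds &= i_abs(s, eps) <= i_abs(a, eps) + i_abs(b, eps) + tol
        holds &= i_rel(s, eps) <= w * i_rel(a, eps) + (1 - w) * i_rel(b, eps) + tol
    holds &= sigma(s) <= sigma(a) + sigma(b) + tol

# Proportional distributions: the relation holds with equality.
a = [10.0, 20.0, 70.0]
b = [2 * v for v in a]
s = [u + v for u, v in zip(a, b)]
equality_gap = abs(i_abs(s, 2.0) - (i_abs(a, 2.0) + i_abs(b, 2.0)))

print(holds, equality_gap)
```

A random search of this kind cannot prove the inequalities, which rest on Minkowski's inequality; it only guards against transcription slips in the formulas.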

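IX.c.3. Leftist measures. For $I_l$, $\bar{\bar{x}}(x) = \bar{x} - I_l(x) = -(1/\alpha)
\,\mathrm{Log}((1/n) \sum e^{-\alpha x_i})$ with $\alpha > 0$ will be shown to be concave by the method
of directional derivatives: we choose $n$ numbers $z_i$, replace $x_i$ by $x_i + z_i t$,
and compute the derivatives $\bar{\bar{x}}'$ and $\bar{\bar{x}}''$ of $\bar{\bar{x}}$ for $t$ at $t = 0$. $\bar{\bar{x}}(x)$ is concave
if and only if $\bar{\bar{x}}'' \le 0$ for all $z_i$’s. Let us thus write

$$e^{-\alpha \bar{\bar{x}}} = (1/n) \sum e^{-\alpha (x_i + z_i t)}.$$

Differentiating twice gives

$$e^{-\alpha \bar{\bar{x}}}\, \bar{\bar{x}}' = (1/n) \sum e^{-\alpha x_i} z_i$$

and

$$e^{-\alpha \bar{\bar{x}}}\, (\bar{\bar{x}}'' - \alpha \bar{\bar{x}}'^2) = -(\alpha/n) \sum e^{-\alpha x_i} z_i^2$$

for $t = 0$. For this $t$, carrying $\bar{\bar{x}}'$ from the second equation into the third
one and using the first one, we finally obtain

$$(n^2/\alpha)\, e^{-2\alpha \bar{\bar{x}}}\, \bar{\bar{x}}'' = \Big(\sum e^{-\alpha x_i} z_i\Big)^2 - \Big(\sum e^{-\alpha x_i} z_i^2\Big)\Big(\sum e^{-\alpha x_i}\Big).$$

Let us now apply Cauchy's theorem (“the square of a sum of products
is smaller than the product of the sums of squares, unless the variables
are proportional”) to the two series of numbers $e^{-\alpha x_i/2} z_i$ and $e^{-\alpha x_i/2}$:

$$\Big(\sum e^{-\alpha x_i} z_i\Big)^2 \le \Big(\sum e^{-\alpha x_i} z_i^2\Big)\Big(\sum e^{-\alpha x_i}\Big)$$

with equality if and only if all $z_i$’s are equal. Thus, $\bar{\bar{x}}'' \le 0$, with equality
for equal variations of all $x_i$’s and only in this case. And finally $I_l(x) =
\bar{x} - \bar{\bar{x}}(x)$ is convex, strictly except on the directions of equal changes in
all $x_i$'s.

$I_l^r = I_l/\bar{x}$ is thus constant-sum convex.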
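
The closed form for the second directional derivative obtained above can be compared with a finite-difference estimate. This is a sketch under assumptions: the vectors `x`, `z` and the value of `alpha` are arbitrary illustrative choices, and `ede_left` and `second_derivative` are hypothetical names for $\bar{\bar{x}}(x)$ and for $\bar{\bar{x}}''$ solved out of the last displayed identity (using $n e^{-\alpha \bar{\bar{x}}} = \sum e^{-\alpha x_i}$).

```python
import math

def ede_left(x, alpha):
    """Equal equivalent income of the leftist measure: -(1/alpha) log mean exp(-alpha x_i)."""
    n = len(x)
    return -math.log(sum(math.exp(-alpha * v) for v in x) / n) / alpha

def second_derivative(x, z, alpha):
    """Closed form of the second derivative at t = 0 along direction z."""
    w = [math.exp(-alpha * v) for v in x]
    s0 = sum(w)
    s1 = sum(wi * zi for wi, zi in zip(w, z))
    s2 = sum(wi * zi * zi for wi, zi in zip(w, z))
    return alpha * (s1 * s1 - s2 * s0) / (s0 * s0)

def finite_difference(x, z, alpha, h=1e-3):
    """Central second difference of t -> ede_left(x + z t) at t = 0."""
    f = lambda t: ede_left([xi + zi * t for xi, zi in zip(x, z)], alpha)
    return (f(h) - 2.0 * f(0.0) + f(-h)) / (h * h)

x = [1.0, 2.0, 4.0]        # illustrative incomes (assumed)
z = [1.0, -1.0, 0.5]       # illustrative direction (assumed)
alpha = 0.5

exact = second_derivative(x, z, alpha)
approx = finite_difference(x, z, alpha)
flat = second_derivative(x, [1.0, 1.0, 1.0], alpha)   # equal changes in all incomes

print(exact, approx, flat)
```

The equal-change direction gives a zero second derivative, which is the non-strict case named in the text.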

IX.d. Applications of Sub-Additivity of Inequality

Per person and total inequalities are linearly homogeneous and convex
if and only if the corresponding per pound inequality is intensive and
constant-sum convex. This is equivalent to respectively subadditivity
and weighted subadditivity of these measures. (These relations follow
straightforwardly from the preceding demonstrations.) This subsection
deals with such inequality measures. It thus applies in particular to the
per person $I_r^a$ and $\sigma$ and the corresponding per pound $I_r$ and $\sigma/\bar{x}$. Its
object is to show examples of applications of the subadditivity properties.

If for instance $Y$ is national income, $I_{global}$ the inequality in its distribution,
and $Y_{earned}$, $Y_{unearned}$, $I_{earned}$, $I_{unearned}$ respectively the total
earned and unearned incomes and the inequalities in their distributions,

$$I_{global} \le I_{earned} + I_{unearned}$$

for per person and total inequalities, and

$$I_{global} \le (Y_{earned}/Y)\, I_{earned} + (Y_{unearned}/Y)\, I_{unearned}$$

for per pound inequalities. We know too well that the condition for
equality in these relations does not hold.

We can also consider the effect of growth on income inequality. If
the $x^k$'s are successive income increments, the relations of the preceding
subsection are between the inequality in some incomes $\sum x^k$ and the
inequalities in their successive increments. But let us rather call now $x^t$
the income distribution at time $t$, and write $I_t = I(x^t)$ for inequality
at time $t$.

For per person or total inequality, the following relation holds between
income inequalities in years $t$ and $t + 1$ and the inequality in the yearly
increments $\Delta x^t = x^{t+1} - x^t$:

$$I_{t+1} \le I_t + I(\Delta x^t),$$
or

$$I_{t+1} - I_t \le I(\Delta x^t),$$

with equality in the relation if and only if all incomes grow in the same
proportion; that is: the increment of inequality is lower than the inequality
of the increment, except when all incomes increase in the same proportion,
in which case they are equal. $\theta$ being a time interval, we can also write

$$I_{t+\theta} \le I_t + I(x^{t+\theta} - x^t),$$

or

$$\frac{I_{t+\theta} - I_t}{\theta} \le \frac{I(x^{t+\theta} - x^t)}{\theta} = I\Big(\frac{x^{t+\theta} - x^t}{\theta}\Big),$$

the last equality holding because of the linear homogeneity of these
measures. Letting $\theta$ tend to zero, and using Newton’s dot to indicate time
derivatives, this inequality becomes

$$\dot{I} \le I(\dot{x}).$$

Equality again holds when $\dot{x}$ and $x$ are proportional, and it does not hold
when they are not. 8 This again says, but now for the time derivatives,
that the increment in inequality is not higher than the inequality in the
increment, both being equal if and only if all incomes grow at the same
rate. This result is also conveniently written as a relation between the growth
rate of inequality $\dot{I}/I$ and the relative inequality of the growth tendency
$I(\dot{x})/I$ (if $I > 0$), as $\dot{I}/I \le I(\dot{x})/I$: the inequality growth rate is not
higher than the relative inequality of the growth tendency, and they are
equal if and only if all incomes have the same growth rate.
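
As a small worked illustration of the yearly relation, the sketch below compares the increment of inequality with the inequality of the increment for two linearly homogeneous per person measures. The income figures are invented for the example, and `sigma` and `i_geo` (average income minus geometric mean, i.e. the $\varepsilon = 1$ case of $I_r^a$) are hypothetical helper names.

```python
import math

def mean(x):
    return sum(x) / len(x)

def sigma(x):
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def i_geo(x):
    """Average income minus geometric mean (needs strictly positive incomes)."""
    return mean(x) - math.exp(sum(math.log(v) for v in x) / len(x))

x_t = [10.0, 20.0, 30.0, 40.0]     # incomes in year t (assumed)
dx = [1.0, 1.0, 6.0, 2.0]          # positive increments, not proportional to x_t
x_next = [a + b for a, b in zip(x_t, dx)]

results = {}
for name, measure in (("sigma", sigma), ("i_geo", i_geo)):
    increment_of_inequality = measure(x_next) - measure(x_t)
    inequality_of_increment = measure(dx)
    results[name] = (increment_of_inequality, inequality_of_increment)
    print(name, increment_of_inequality, "<=", inequality_of_increment)

# Equiproportional growth: both sides coincide.
dx_prop = [0.1 * v for v in x_t]
x_prop = [a + b for a, b in zip(x_t, dx_prop)]
gap = abs((sigma(x_prop) - sigma(x_t)) - sigma(dx_prop))
print("gap under proportional growth:", gap)
```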

For per pound inequality, we similarly have

$$X_{t+1} I_{t+1} \le X_t I_t + (X_{t+1} - X_t)\, I(\Delta x^t)$$

with equality if and only if $x^t$, $x^{t+1}$, and $\Delta x^t$ are proportional, in which
case their three inequalities are equal. Writing

$$\frac{X_{t+\theta} I_{t+\theta} - X_t I_t}{\theta} \le \frac{X_{t+\theta} - X_t}{\theta}\; I\Big(\frac{x^{t+\theta} - x^t}{\theta}\Big)$$

(since $I(x^{t+\theta} - x^t) = I((x^{t+\theta} - x^t)/\theta)$ for these measures), and letting $\theta$
tend to zero, we obtain

$$\dot{I} X + I \dot{X} \le \dot{X}\, I(\dot{x}).$$

The same remark as above holds for the equality and strict inequality
cases. But, now, proportionality between $\dot{x}$ and $x$ means $I = I(\dot{x})$, which
8 The passage to the limit does not guarantee this assertion. But it holds because if, in
$R^{n+1}$ space $(I, x)$, we consider the convex half-cone graph of $I(x)$, and the half-cone
translated from it and whose summit is point $(I, x)$, the latter half-cone lies completely
in the interior of the former one except for its ray on the line from origin to this point,
which they have in common.



the relation, with equality, shows to be equivalent to $\dot{I} = 0$ (since $X > 0$).
Apart from these cases and for $I > 0$, the relation can be written as

$$\frac{\dot{I}/I}{\dot{X}/X} < \frac{I(\dot{x}) - I}{I}:$$

the growth rate of inequality is lower than the growth rate in global income
times the relative excess of the inequality in growth tendency over the present
one, except when all incomes grow at the same rate, in which case inequality
does not change.

We can also write a relation about the effect of government welfare 
transfers on income inequality: the income inequality after transfers is 
lower than the before transfers one plus the inequality in transfers (for 
per person or total inequalities; for per pound ones the sum should be 
weighted by the respective volumes of incomes and transfers). The equality 
case in the relations is of course irrelevant. But for the same reason the 
result is not very informative in this case, since one of the usual reasons 
for transfers is to decrease income inequality. However, the relation 
becomes much more interesting if we consider the money equivalent of 
government services for the persons, so as to see how the inequality of 
benefits from government expenditures mixes with that of private incomes. 
Again, total inequality would generally be lower than the sum of these 
inequalities (weighted by private income and government expenditure 
levels for the per pound measures). But, for this problem, the equality 
case (proportionality of benefits to incomes) has a high degree of empir¬ 
ical plausibility. 

All this sounds rather optimistic, after all; if we add incomes, inequality 
increases less (per person), or is lower than the highest (per pound). 
However, the main tool to affect the inequality of incomes in our society 
does not add to them but subtracts from them: it is the tax system. 
Now, when adding income distributions, we excluded the possibility of
“negative” incomes, by the very definition of an income distribution.
But if some variation is a decrease in all incomes (or at least no increase
in any), we may consider it as a positive (or nonnegative) addition to the
final distribution to obtain the initial one. Thus, if $I_{at}$, $I_{bt}$, $I_t$ respectively
are inequalities in after tax and before tax incomes and in the tax distribution,
we have

$$I_{at} \ge I_{bt} - I_t$$

for inequality per person or total, and, calling $Y$ the global income and
$T$ the tax revenue,

$$I_{at} \ge (Y/(Y - T))\, I_{bt} - (T/(Y - T))\, I_t$$

for inequality per pound. So, after tax inequality is not lower than before
tax inequality less fiscal inequality (with weights equal to the respective
amounts in the case of per pound inequality). It is equal to this difference
for a proportional income tax, and only in this case (this of course does
not imply that it is the structure which gives the lowest after tax inequality).
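
The two tax relations can be illustrated with the standard deviation (per person) and the coefficient of variation (per pound). The figures below are invented for the example, and `sigma` and `cv` are hypothetical helper names; the sketch only shows one progressive schedule and one proportional one, not a general proof.

```python
import math

def mean(x):
    return sum(x) / len(x)

def sigma(x):
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

def cv(x):
    return sigma(x) / mean(x)

before = [10.0, 20.0, 30.0, 60.0]          # before tax incomes (assumed)
tax = [0.0, 2.0, 5.0, 20.0]                # a progressive tax schedule (assumed)
after = [b - t for b, t in zip(before, tax)]

Y, T = sum(before), sum(tax)

# Per person: I_at >= I_bt - I_t
per_person_bound = sigma(before) - sigma(tax)
# Per pound: I_at >= (Y/(Y-T)) I_bt - (T/(Y-T)) I_t
per_pound_bound = (Y / (Y - T)) * cv(before) - (T / (Y - T)) * cv(tax)

print(sigma(after), ">=", per_person_bound)
print(cv(after), ">=", per_pound_bound)

# Proportional tax at 20 per cent: the per person relation holds with equality.
tax_prop = [0.2 * b for b in before]
after_prop = [b - t for b, t in zip(before, tax_prop)]
gap = abs(sigma(after_prop) - (sigma(before) - sigma(tax_prop)))
print("gap under proportional tax:", gap)
```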

IX.e. Diminishing Returns in Equality 

We have found that inequality measures $I_l$, $I_c$, $I_r^a$, $\sigma$ are convex,
whereas $I_r$, $I_l^r$, $I_c^r$, $\sigma/\bar{x}$ are constant-sum convex. This gives them some
“diminishing returns” property, which they share with any other measures 
which are convex, or convex in some sub-spaces of the x space. This 
property is described by relations between the inequalities of several 
distributions. These distributions must have the same total sum or average 
for the constant-sum convex measures (but then the property holds for 
all such sums or averages), whereas no such restriction holds for the merely 
convex measures. 

This property can be expressed as: if we move regularly along a line
in $x$ space, the increase in inequality becomes faster and faster, or the
decrease in inequality slower and slower. There is, of course, a limiting
exception for the fully convex measures ($I_r^a$, $I_c$, $I_l$, $\sigma$), obtained when this
line is a projection on $x$ space of a line located on the hypersurface $I(x)$,
ray of the cone graph of $I_r^a$, $\sigma$ or $I_c$, or generatrix of the cylinder graph
of $I_l$ (i.e., line parallel to the equality direction $e$, along which $I_l$
is constant): inequality differences vary proportionally to distance
differences for the former (for $I_r^a$ and $\sigma$, this is an equiproportional
variation in all $x_i$’s), and inequality does not change for the latter (equal
variation in all $x_i$’s); these are the cases for which equality holds in the
following relations. $x^0$ and $x^1$ being specific distributions, the property
can thus be written either classically as

$$I[\lambda x^0 + (1 - \lambda) x^1] \le \lambda \cdot I(x^0) + (1 - \lambda) \cdot I(x^1)$$
for $0 < \lambda < 1$, or differentially as

$$\sum_i (x_i - x_i^0)\,(\partial I(x)/\partial x_i) \ge I(x) - I(x^0).$$

It is particularly interesting to take a reference point with equality:
call $\zeta$ its coordinates; we have $I(\zeta e) = 0$. Then, for all $x$’s whose $x_i$’s
are not all equal ($I(x) \ne 0$), convexity gives:

$$I[\lambda \cdot (x - \zeta e) + \zeta e] \gtrless \lambda \cdot I(x)$$

depending on $\lambda \gtrless 1$, whatever $\zeta$ for $I_l$, for $\zeta > 0$ for $I_r^a$ and $\sigma$, for
$\zeta > -\xi$ for $I_c$. The inequalities are reversed if $\zeta < 0$ for $I_r^a$ and $\sigma$,
and for $\zeta < -\xi$ for $I_c$. In all cases, a possible $\zeta$ is $\bar{x}$ so that the
transformation is mean preserving; this is a case of the former category for
$I_r^a$, $\sigma$, and $I_c$. But $\zeta = \bar{x}$ is the only possible case for the constant-sum
convexity case of $I_r$, $I_l^r$, $I_c^r$, $\sigma/\bar{x}$. The property then is

$$I[\lambda \cdot (x - \bar{x} e) + \bar{x} e] \gtrless \lambda \cdot I(x)$$

depending on $\lambda \gtrless 1$.
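
The mean-preserving case can be tried out on a convex measure that is not linear along the path toward equality. The sketch below assumes a three-person distribution chosen for illustration and uses average income minus geometric mean (the $\varepsilon = 1$ case of $I_r^a$) as the measure; `i_geo` and `toward_mean` are hypothetical helper names.

```python
import math

def mean(x):
    return sum(x) / len(x)

def i_geo(x):
    """Average income minus geometric mean: convex, zero at equality."""
    return mean(x) - math.exp(sum(math.log(v) for v in x) / len(x))

def toward_mean(x, lam):
    """Mean-preserving contraction (lam < 1) or spread (lam > 1) around the average."""
    m = mean(x)
    return [lam * (v - m) + m for v in x]

x = [10.0, 20.0, 60.0]     # illustrative distribution (assumed)

for lam in (0.25, 0.5, 0.75, 1.2):
    moved = toward_mean(x, lam)
    print(lam, i_geo(moved), "vs", lam * i_geo(x))
```

Bridging half the gap to equality (`lam = 0.5`) leaves less than half of the initial inequality, which is the “diminishing returns” reading given earlier in this section.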

If, in particular, we take the origin as this point when possible, we find
again some previous results: $I(\lambda x) \gtrless I(x)$ depending on $\lambda \gtrless 1$, for
inequality measures $I_l$, $I_c$ (for $\xi > 0$) and $\sigma$.


X. Union of Populations 
X.a. Required Properties 

If two countries which display the same degree of inequality unite to 
form a unique country, will we want the measure of inequality to indicate 
that inequality per person or per pound in the latter is the same one as 
in the constituting countries? No, because if each of the two initial countries 
has only one inhabitant, its income distribution will display no inequality, 
whereas the union country will be an unequal one if these two persons 
do not have the same income, and, more generally, if income is equally 
distributed in each of these two countries, but average incomes differ, 
inequality is inexistent in these initial countries but exists in the union. 
Thus, what we might want is total inequality in a union of populations
to be larger, or not smaller, than the sum of the total inequalities in the
constituting countries. That is, if $k$ is an index representing a population,
$n_k$ and $I_k^a$ the number of persons and the inequality per person in population
$k$, $n = \sum n_k$ and $I^a$ the total number of persons and inequality per
person in the union,

$$n I^a \ge \sum_k n_k I_k^a.$$

This is equivalent to the relation between per person inequalities

$$I^a \ge \sum_k (n_k/n)\, I_k^a,$$

the right-hand side of which is an average of inequalities per person,
appropriately weighted by the number of persons in each population.
Calling $\bar{x}_k$, $X_k = n_k \bar{x}_k$ and $I_k^r$ the average and total incomes and the
inequality per pound in population $k$, and $\bar{x} = \sum (n_k/n)\, \bar{x}_k$, $X = n\bar{x} = \sum X_k$
and $I^r$ the average and total incomes and the inequality per pound in
the total population, these inequalities are also equivalent to the following
one between inequalities per pound

$$I^r \ge \sum_k (X_k/X)\, I_k^r,$$

where the right-hand side is an average of per pound inequalities, appropriately
weighted by the number of pounds (total incomes) in the populations.

The general result is that these relations hold for all the inequality 
measures we have considered until now, the equal sign holding if all the 
constituting populations have the same inequality and the same average 
income (but, for most measures, not only in this case). 

Let us consider separately the measures with “welfare independence” 
property, and standard deviation with coefficient of variation. 

X.b. Independent Measures [1, Theorem 21]

We have the following results, for inequality measures of the type
$I^a = \bar{x} - \bar{\bar{x}}$, $I^r = I^a/\bar{x}$, $nI^a$.

With the independence property and the basic properties of inequality 
measures (zero at equality, positive out of it, impartiality-symmetry), 
the above relations hold for all populations and unions. They hold with 
equality (resp., strict inequality) if and only if all constituting populations 
have (resp., have not) the same equal equivalent income. We recall that 
equal equivalent incomes are the same if both average incomes and 
inequalities are (but this sufficient condition is not necessary). 

And if, for an inequality measure which is independent, impartial and 
zero at equality, these relations strictly hold for unions of populations which 
do not all have the same equivalent income, this measure is positive out of 
equality (if these relations just hold for all unions of populations, the 
measure is nonnegative). And positivity out of equality is then equivalent 
to strict transfers principle and strict "isophily" (a small transfer from a 
richer person to a poorer one decreases inequality, a Lorenz curve 
nowhere lower and somewhere higher implies lower inequality for distributions
of same total and average incomes) and even to constant-sum
strict quasi-convexity of the measure in the $x_i$'s (inequality of a distribution
which is an average of several other ones of same total income is smaller
than the largest of the latters’ inequalities) (cf. Section XI below). 9

9 However, it is not equivalent to convexity, or even quasi-convexity, of the per
person measure in the set of the $x_i$’s, although both hold together in the special cases
of independent measures studied above ($I_l$, $I_c$, $I_r^a$). These convexities imply
nonnegativity of the measure, but the converse is not true (cf. Section XI below).



These results show that if all the constituting populations present the 
same degree of inequality, the global population will generally not have 
itself this inequality: it will generally be more unequal than each of its 
components. The exception, where global inequality is equal to the equal 
inequalities of the constituent populations, occurs if and only if the latter 
furthermore have equal average incomes (since this is then identical to 
saying that they have the same equal equivalent incomes). This neatly
shows the double dependence of global inequality upon inequalities both 
within and between the constituting populations. 

We note that H. Dalton’s “principle of equiproportionate additions
to persons” [8] is a special case of union of populations with the same equal
equivalent income, since it comes to lumping together populations which
duplicate all persons by the same numbers and thus have the same
$\varphi(\bar{\bar{x}}) = (1/n) \sum \varphi(x_i)$. All these populations also have identical average
incomes and inequalities.

The proof of these results is straightforward. We call $J_k$ the set of indices $i$
of persons in population $k$, $\bar{\bar{x}}_k$ the equal equivalent income of population $k$,
$\bar{\bar{x}}$ the global equal equivalent income. We choose an increasing specification
of function $\varphi$, which is always possible since a $\varphi$ can be replaced by
$a\varphi + b$ with a negative $a$; $I > 0$ out of equality is then equivalent to
strict concavity of $\varphi$ (cf. Section XI.d below). Then,

$$\varphi(\bar{\bar{x}}) = \frac{1}{n} \sum_i \varphi(x_i) = \sum_k \frac{1}{n} \sum_{i \in J_k} \varphi(x_i) = \sum_k \frac{n_k}{n}\, \varphi(\bar{\bar{x}}_k) \le \varphi\Big(\sum_k \frac{n_k}{n}\, \bar{\bar{x}}_k\Big)$$

with equality if and only if all $\bar{\bar{x}}_k$'s are equal. This last inequality and
precision is equivalent to the strict concavity of $\varphi$. And the comparison of
the first and last terms is equivalent to $\bar{\bar{x}} \le \sum (n_k/n)\, \bar{\bar{x}}_k$ with the same
precision since $\varphi$ is increasing. We also have $n\bar{x} = \sum n_k \bar{x}_k$. With
$I_k^a = \bar{x}_k - \bar{\bar{x}}_k$ and $I^a = \bar{x} - \bar{\bar{x}}$, the above mentioned results follow.
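
The role of the equal equivalent income in the equality case can be seen on a small example. The sketch assumes $\varphi = \log$, so that the equal equivalent income is the geometric mean; the three populations are invented for the illustration and the helper names (`ede_log`, `i_abs`, `total_inequality`) are hypothetical.

```python
import math

def mean(x):
    return sum(x) / len(x)

def ede_log(x):
    """Equal equivalent income for phi = log: the geometric mean."""
    return math.exp(sum(math.log(v) for v in x) / len(x))

def i_abs(x):
    """Per person inequality: average income minus equal equivalent income."""
    return mean(x) - ede_log(x)

def total_inequality(x):
    return len(x) * i_abs(x)

pop_a = [10.0, 40.0]          # equal equivalent income 20, average 25
pop_b = [20.0, 20.0, 20.0]    # equal equivalent income 20, average 20
pop_c = [30.0, 30.0]          # equal equivalent income 30

# Same equal equivalent income (averages differ): equality in the relation.
same = total_inequality(pop_a + pop_b)
same_parts = total_inequality(pop_a) + total_inequality(pop_b)

# Different equal equivalent incomes: strict inequality.
diff = total_inequality(pop_a + pop_c)
diff_parts = total_inequality(pop_a) + total_inequality(pop_c)

print(same, same_parts)
print(diff, diff_parts)
```

Populations `pop_a` and `pop_b` have different average incomes and different inequalities, yet the relation holds with equality, which illustrates that equal averages and equal inequalities are sufficient but not necessary.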

X.c. Standard Deviation and Coefficient of Variation 
As for the standard deviation $\sigma$, summing

$$(x_i - \bar{x})^2 = (x_i - \bar{x}_k + \bar{x}_k - \bar{x})^2 = (x_i - \bar{x}_k)^2 + (\bar{x}_k - \bar{x})^2 + 2(x_i - \bar{x}_k)(\bar{x}_k - \bar{x})$$

over $i \in J_k$, and then over $k$, gives

$$\sigma^2 = \sum_k (n_k/n)\,[\sigma_k^2 + (\bar{x}_k - \bar{x})^2]$$

which neatly shows the separate effects on total inequality of intra-population
inequalities $\sigma_k$, and inter-populations inequality of average
incomes $\sum (n_k/n)(\bar{x}_k - \bar{x})^2$. Thus,

$$\sigma^2 \ge \sum_k (n_k/n)\, \sigma_k^2 \ge \Big(\sum_k (n_k/n)\, \sigma_k\Big)^2$$

with equality if and only if $\bar{x}_k = \bar{x}$ for all $k$'s for the first inequality and
all $\sigma_k$'s are equal for the second one. Therefore,

$$\sigma \ge \sum_k (n_k/n)\, \sigma_k$$

with equality if and only if all populations have both the same average
income and the same standard deviation of incomes. This is also equivalent
to the required relation between “total inequalities” $n\sigma$, and between the
“per pound” coefficient of variation:

$$\sigma/\bar{x} \ge \sum_k (X_k/X)\, \sigma_k/\bar{x}_k.$$

These properties are thus exactly the same ones as for the other measures 
under consideration, the only difference being that it is now both necessary 
and sufficient that both average incomes and inequalities be the same for 
all populations in order that the equality sign holds in the relations. 
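
The variance decomposition and the relations derived from it can be verified directly. This is a minimal sketch on two invented populations; `mean` and `sigma` are hypothetical helper names, and the population weights follow the definitions of $n_k/n$ and $X_k/X$ given above.

```python
import math

def mean(x):
    return sum(x) / len(x)

def sigma(x):
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

populations = [[10.0, 20.0, 30.0], [40.0, 60.0]]   # illustrative data (assumed)
union = [v for pop in populations for v in pop]

n, X = len(union), sum(union)
x_bar = mean(union)

# sigma^2 = sum_k (n_k/n) [sigma_k^2 + (mean_k - mean)^2]
within = sum(len(p) / n * sigma(p) ** 2 for p in populations)
between = sum(len(p) / n * (mean(p) - x_bar) ** 2 for p in populations)

# Person-weighted average of standard deviations and
# pound-weighted average of coefficients of variation.
avg_sigma = sum(len(p) / n * sigma(p) for p in populations)
avg_cv = sum(sum(p) / X * sigma(p) / mean(p) for p in populations)

print(sigma(union) ** 2, within + between)
print(sigma(union), ">=", avg_sigma)
print(sigma(union) / x_bar, ">=", avg_cv)
```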


XI. General Structures of Inequality Measures 
XI.a. The Problem

We started from specific measures of inequality, then considered 
measures of more and more general form, and economic properties which 
belong to still much more general classes of measures (such as the economic 
meanings of intensivity, equal increase in all incomes, subadditivity, 
convexity, quasi-convexity, Schur-convexity, etc.). We consider now these 
more general structures, the economic consequences of which have already 
been discussed a propos the properties of more specific measures exhibiting 
them. 

We shall consider properties pertaining to the distribution $x = \{x_i\}$
($i = 1, \ldots, n$), its average income $\bar{x} = \sum (x_i/n)$, the “evaluation function”
$V(x)$, the equal equivalent income $\bar{\bar{x}}$ ($V(x) = V(\bar{\bar{x}}e)$), and the measures of
per person and per pound inequality $I^a = \bar{x} - \bar{\bar{x}}$ and $I^r = I^a/\bar{x}$; when
a property holds for both $I^a$ and $I^r$, we shall mention it for $I$. When
it holds only at given $\bar{x}$, we shall add the adjective “constant-sum.”

The subject matter will be properties of the functions of $x$: $V$, $\bar{\bar{x}}$, $I^a$, and $I^r$;
the topic will be both to relate the corresponding properties of these
functions, and more interestingly to establish the general relations between
these different properties. $V$ can have only ordinal properties. The
properties of $\bar{x}$ will be used, and they are quite obvious: as a function of $x$
it is increasing, symmetrical, intensive, increased by some amount if all
$x_i$'s are increased by this amount, at the limit of the “principle of
diminishing transfers,” and altogether weakly concave, convex, quasi-concave,
quasi-convex, Schur-concave, Schur-convex.

$I = 0$ if all $x_i$’s are equal by definition of $\bar{\bar{x}}$ and $I$. Symmetry of $V$, $\bar{\bar{x}}$,
and $I$ go together (“impartiality”). We assume $V$ is an increasing function
of the $x_i$'s (“benevolence”), so that $\bar{\bar{x}}$ is well defined and has the same
property.

XI.b. The Various Convexities and Concavities

We first point out the general relations between the various kinds of
convexities and concavities. They will be applied to the functions of $x$: $V$,
$\bar{\bar{x}}$, $I^a$, $I^r$. The following sentence implies four sentences: we can replace
“concavity” by “convexity,” and in each case this word can mean either
the “strict” or the “weak” property. Concavity implies constant-sum
concavity and quasi-concavity, either of the latter two implies constant-sum
quasi-concavity, constant-sum quasi-concavity plus symmetry 10 imply
Schur-concavity which implies rectifiance (“transfers principle”). These
relations, apart from the last two, result from the fact that the intersection
of two convex sets is convex.

To prove the last but one, we call $F$ a constant-sum quasi-concave and
symmetrical function of $x$, $B$ a bistochastic matrix (an $n \times n$ square
matrix whose entries $b_{ij}$ are nonnegative and satisfy $\sum_i b_{ij} = \sum_j b_{ij} = 1$
for all $i$ and $j$’s), $P^\pi$ the permutation matrix of index $\pi$ ($P^\pi$ is a $B$ whose
entries are all 0 or 1), $\lambda_\pi$ weights ($\lambda_\pi \ge 0$ for all $\pi$'s, $\sum \lambda_\pi = 1$). All
$P^\pi x$ have the same average $\bar{x}$, and $F(P^\pi x) = F(x)$ is the definition of $F$’s
symmetry. Birkhoff’s theorem [7] says that $B$ can be written as $\sum \lambda_\pi P^\pi$.
Ostrowski’s theorem says that rectifiance (“transfers principle”) and
symmetry for a weakly Schur-concave $F$ are equivalent to $F(Bx) \ge F(x)$
for all $B$'s. And $F$'s constant-sum weak quasi-concavity implies
$F[\sum \lambda_\pi (P^\pi x)] \ge F(x)$. The result is then proved by

$$F(Bx) = F\Big(\sum_\pi \lambda_\pi P^\pi x\Big) \ge F(x)$$

for the weak properties. For the strict ones, we notice that strict Schur-concavity
of $F$ is $F(Bx) > F(x)$ for all $x$’s and $B$'s such that $Bx$ is not a
$P^\pi x$. And if $Bx = \sum \lambda_\pi P^\pi x$ is not a $P^\pi x$, the $P^\pi x$'s for the nonzero $\lambda_\pi$’s

10 Constant-sum symmetry would be enough. It is implied by symmetry. 





are not all equal. Then, constant-sum strict quasi-concavity of $F$ implies
$F(\sum \lambda_\nu P^\nu x) > F(x)$. The result is then proved by

$$F(Bx) = F\left(\sum \lambda_\nu P^\nu x\right) > F(x)$$

for $Bx$ not a $P^\nu x$. Finally, reversing the inequalities proves the relation
for the convexities.

Now, we also have the following property for an inequality measure
$I(x)$ however defined (not necessarily as $I^a$ or $I^r$ above). If $I(x)$ is zero at
equality and symmetrical (“impartial”), and if it is either rectifiant (thus,
Schur-convex), or constant-sum quasi-convex, or constant-sum convex, or
quasi-convex, it is positive out of equality for the strict forms, and
nonnegative for the weak ones (thus a symmetrical, zero at equality, convex
$I(x)$ is nonnegative, strict convexity being excluded by the zero at equality
condition). All these properties result from the one with Schur-convexity.
Strict Schur-convexity means $I(Bx) < I(x)$ if $B$ is a bistochastic matrix
and $Bx$'s coordinates are not a permutation of $x$'s. If we take a $B$ whose
entries are all $1/n$, the latter condition just requires that the $x_i$'s are not
all equal; we have $Bx = \bar{x}e$, and thus

$$I(x) > I(Bx) = I(\bar{x}e) = 0.$$

For the weak forms, we replace $>$ by $\ge$.
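
The argument above can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes the population variance as an illustrative symmetric, convex, zero-at-equality index (the paper's property is stated for any such $I$), and it builds bistochastic matrices as random convex combinations of permutation matrices, as in Birkhoff's representation. All function names are hypothetical.

```python
import itertools
import random


def variance(x):
    """Population variance: an illustrative symmetric, convex index, zero at equality."""
    mean = sum(x) / len(x)
    return sum((xi - mean) ** 2 for xi in x) / len(x)


def random_bistochastic(n, rng):
    """Random convex combination of the n! permutation matrices (Birkhoff form)."""
    perms = list(itertools.permutations(range(n)))
    weights = [rng.random() for _ in perms]
    total = sum(weights)
    B = [[0.0] * n for _ in range(n)]
    for w, perm in zip(weights, perms):
        for i, j in enumerate(perm):
            B[i][j] += w / total
    return B


def apply_matrix(B, x):
    """Matrix-vector product Bx."""
    return [sum(bij * xj for bij, xj in zip(row, x)) for row in B]


rng = random.Random(0)
x = [1.0, 4.0, 7.0, 12.0]
n = len(x)

# Weak Schur-convexity: I(Bx) <= I(x) for every bistochastic B tried.
schur_ok = all(
    variance(apply_matrix(random_bistochastic(n, rng), x)) <= variance(x) + 1e-12
    for _ in range(200)
)

# The matrix with all entries 1/n maps x to the equal distribution (mean times e),
# where the index is zero; hence I(x) >= I(Bx) = 0.
B_equal = [[1.0 / n] * n for _ in range(n)]
x_equal = apply_matrix(B_equal, x)
```

Here `x_equal` has all coordinates equal to the mean of `x`, so the index vanishes there while it is positive at `x`.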

Coming back to $I$ functions which are $I^a$ or $I^r$ defined as above from
$V(x)$ and $\bar{x}$, these definitions, any definition of Schur-convexity or
-concavity and the above property show that Schur-concavity of $V$, of $\bar{\bar{x}}$,
and Schur-convexity of $I$ ($I^a$ and $I^r$) are equivalent, and they imply $\bar{\bar{x}} < \bar{x}$,
i.e., $I > 0$, out of equality (for the strict forms, the same holding for the
weak ones with replacement of $>$ by $\ge$ in the properties and proofs).

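But we also remark that constant-sum quasi-concavity of $V$ and $\bar{\bar{x}}$ are
equivalent to constant-sum quasi-convexity of $I$ ($I^a$ and $I^r$), with
correspondence between the strict and weak forms. This is so because

$$\frac{\bar{x}^1 + \bar{x}^2}{2} - \bar{\bar{x}}\left(\frac{x^1 + x^2}{2}\right) < \operatorname{Max}(\bar{x}^1 - \bar{\bar{x}}^1,\ \bar{x}^2 - \bar{\bar{x}}^2),$$

which is equivalent to

$$I^a\left(\frac{x^1 + x^2}{2}\right) < \operatorname{Max}[I^a(x^1),\ I^a(x^2)],$$

is equivalent to

$$\bar{\bar{x}}\left(\frac{x^1 + x^2}{2}\right) > \operatorname{Min}(\bar{\bar{x}}^1,\ \bar{\bar{x}}^2)$$

if $\bar{x}^1 = \bar{x}^2$. The strict form then results by considering $x^1 \ne x^2$, and the
weak form by replacing $>$ by $\ge$. Of course, the relation for $I^r = I^a/\bar{x}$ is
the same as that for $I^a$.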

XI.c. Homotheticities and Translatednesses 

A function $F(x)$ is said to be homothetic when its hypersurfaces
$F(x)$ = constant in $x$ space are transformed into each other by a
homothetic transformation the center of which is the origin. We shall say
it is $X$-homothetic when this property holds with the only difference that
the center is point $X$ in $x$ space. The former homotheticity thus is
0-homotheticity. When such a center goes to infinity, the relation between
the hypersurfaces $F(x)$ = constant in $x$ space becomes that they are
translated from each other in a given direction. When this direction is
that of a vector $X$, we shall say that $F(x)$ is an $X$-translated function. We
now have the following properties.

$V$ is homothetic, $\bar{\bar{x}}$ and $I^a$ are linearly homogeneous, $I^r$ is intensive, are
equivalent properties. The relation between $\bar{\bar{x}}$, $I^a$, $I^r$ just results from the
definitions of $I$. On the other hand, we know that $V$ is homothetic if
and only if it has a linearly homogeneous specification, i.e., there exist
increasing functions $\Phi$ and $W$ such that $V(x) = \Phi[W(x)]$ and $W$ is linearly
homogeneous. This property and the definition of $\bar{\bar{x}}$ then give
$W(x) = W(\bar{\bar{x}}e) = \bar{\bar{x}} \cdot W(e)$, which shows both the relation between $W$ and $\bar{\bar{x}}$
and the linear homogeneity of $\bar{\bar{x}}$.
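
As an illustration only, the relation $\bar{\bar{x}} = W(x)/W(e)$ can be sketched with a specific choice that is an assumption, not taken from the paper: $W$ is the geometric mean (linearly homogeneous) and $\Phi$ is the logarithm. The sketch checks that multiplying all incomes by 3 multiplies the equal equivalent and $I^a$ by 3 and leaves $I^r$ unchanged.

```python
import math


def W(x):
    """Geometric mean: an assumed linearly homogeneous, increasing, symmetric W."""
    return math.exp(sum(math.log(xi) for xi in x) / len(x))


def V(x):
    """A homothetic evaluation function V = Phi[W], with Phi = log (an assumption)."""
    return math.log(W(x))


def equal_equivalent(x):
    """Equal equivalent income, from W(x) = W(ee * e) = ee * W(e)."""
    e = [1.0] * len(x)
    return W(x) / W(e)


def inequality(x):
    """Return (absolute, relative) inequality: mean - ee and (mean - ee) / mean."""
    mean = sum(x) / len(x)
    i_a = mean - equal_equivalent(x)
    return i_a, i_a / mean


x = [2.0, 8.0]
scaled = [3.0 * xi for xi in x]

ee = equal_equivalent(x)
i_a, i_r = inequality(x)
i_a_scaled, i_r_scaled = inequality(scaled)
```

With these numbers the mean is 5 and the equal equivalent is 4, so the absolute measure is 1 and the relative measure is 0.2 before and after scaling.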

$V$ is $e$-translated, an equal variation in all incomes affects $\bar{\bar{x}}$ in the same
way, and it does not change $I^a$, are equivalent properties. The relation
between $\bar{\bar{x}}$ and $I^a$ is obvious. The one between $V$ and $\bar{\bar{x}}$ is deduced from the
previous paragraphs by a change of variables which consists in replacing
$x_i$ by $a^{x_i}$ and $\bar{\bar{x}}$ by $a^{\bar{\bar{x}}}$, where $a$ is any positive constant.$^{11}$

Intermediate cases between these two happen when $V$ is $-\xi e$-homothetic
($\xi$ is a scalar), which is equivalent to $\bar{\bar{x}} + \xi$ being multiplied by $\lambda$ when all
$x_i + \xi$'s are, as is seen by the change of variables from $x_i$ to $x_i + \xi$ and
$\bar{\bar{x}}$ to $\bar{\bar{x}} + \xi$ in the homothetic case.

Now, a marriage between these properties of homotheticity or
translatedness on the one hand, and quasi-concavity on the other hand, gives
an interesting offspring: convexity of $I$. More precisely: if $V$ is both
$X$-homothetic or $X$-translated and quasi-concave, $\bar{\bar{x}}$ is concave and $I^a$ is
convex, $I^r$ therefore is constant-sum convex; and if $V$'s quasi-concavity
is strict, $\bar{\bar{x}}$'s concavity and $I^a$'s convexity are strict out of lines issued from

11 We similarly find the result that a function $F(x)$ is $X$-translated if and only if there
exist functions $\Phi$ and $W$ such that $F(x) \equiv \Phi[W(x)]$ with $W$ such that
$W(x + \lambda X) = W(x) + k\lambda$ (where $k$ is a constant).





point $X$ (for the homotheticity case) or parallel to vector $X$ (for the
translatedness case). In particular, $X$ can be 0 ($V$ homothetic) or $-\xi e$
for the homotheticity cases, or $e$ for the translatedness case. To prove
these results, apart from obvious implications in them, we can change
variables to bring all the cases back to the one of a homothetic $V$, and then
prove the only nontrivial relation by recalling that homothetic
quasi-concave $V$ implies linearly homogeneous quasi-concave $\bar{\bar{x}}$, which in turn
is concave from [9, Chap. VIII, Sect. 8, Theorem 3].

This specific case is of special interest since $V$ homothetic quasi-concave,
i.e., $\bar{\bar{x}}$ linearly homogeneous and concave, and thus $I^a$ linearly homogeneous
and convex, occur if and only if $I^a$ is subadditive.

XI.d. Independence and Convexities

There is still another property, which we used at length in previous
sections: “independence,” or additive separability of $V$. Our starting point
was that this structure, plus impartiality-symmetry and $X$-homotheticity
or $X$-translatedness, give the specific forms of $\bar{\bar{x}}$ and $I$ previously discussed
(the symmetry imposes $X$ to be parallel to $e$). But “independence” also
interferes interestingly with the various kinds of convexity.

We consider again $I(x)$ defined as $I^a = \bar{x} - \bar{\bar{x}}$ or $I^r = I^a/\bar{x}$, with $\bar{\bar{x}}$
defined from $V(x) = V(\bar{\bar{x}}e)$ with $V$ an increasing function (“benevolence”).
“Independence” means that there exist $n + 1$ functions of one variable
$\Phi$ and $\varphi^i$ ($i = 1, \ldots, n$) such that $V = \Phi[\sum \varphi^i(x_i)]$. Since $V$ is increasing
in all the relevant domains, $\Phi$ and the $\varphi^i$'s must be monotonic with the
same sense of variation, and we can always assume they are increasing
functions, which we do. $V$ is, furthermore, “impartial” (symmetrical) if
and only if we can take the same function $\varphi$ for all the $\varphi^i$'s. This is the case
if $\varphi^i(y) = \varphi(y) + c^i$ with constant $c^i$ for all $i$'s since, then, calling
$\Psi(z) = \Phi(z + \sum c^i)$ makes $\Phi[\sum \varphi^i(x_i)]$ identical to $\Psi[\sum \varphi(x_i)]$.
Reciprocally, if an “independent” $V$ is “impartial,” $\varphi^i(y) = \varphi(y) + c^i$ for all $i$'s
since this symmetry implies $\varphi^i(y_1) + \varphi^j(y_2) = \varphi^i(y_2) + \varphi^j(y_1)$, i.e.,
$\varphi^i(y_1) - \varphi^i(y_2)$ is the same for all $i$'s.
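
For an “independent” and “impartial” $V$, the equal equivalent is the generalized mean $\varphi^{-1}[(1/n)\sum \varphi(x_i)]$ (the constants $c^i$ cancel on both sides of its definition). The sketch below is illustrative only: the choices $\varphi = \sqrt{\cdot}$ and $\varphi = \log$ and all function names are assumptions made for the example.

```python
import math


def equal_equivalent(x, phi, phi_inv):
    """Equal equivalent income: phi_inv of the average of phi(x_i)."""
    return phi_inv(sum(phi(xi) for xi in x) / len(x))


def absolute_inequality(x, phi, phi_inv):
    """Per person absolute measure: mean income minus equal equivalent."""
    return sum(x) / len(x) - equal_equivalent(x, phi, phi_inv)


def relative_inequality(x, phi, phi_inv):
    """Relative measure: absolute measure divided by mean income."""
    return absolute_inequality(x, phi, phi_inv) / (sum(x) / len(x))


def square(v):
    return v * v


x = [1.0, 4.0, 9.0]
mean = sum(x) / len(x)

ee_sqrt = equal_equivalent(x, math.sqrt, square)          # concave phi = sqrt
ia_sqrt = absolute_inequality(x, math.sqrt, square)
ee_log = equal_equivalent(x, math.log, math.exp)          # concave phi = log

# Impartiality: permuting incomes does not change the equal equivalent.
ee_sqrt_permuted = equal_equivalent([9.0, 1.0, 4.0], math.sqrt, square)
```

With $\varphi = \sqrt{\cdot}$ the average of $\varphi(x_i)$ is 2, so the equal equivalent is 4 while the mean is 14/3; both concave choices give an equal equivalent below the mean.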

The general results are the following. 

Assuming the respective differentiabilities when required, with
“independence,” the properties within each of the two following groups are
equivalent to each other, with correspondence between weak and strict forms
(and the second group implies the first one's weak form):

(1°) “Rectifiance” (“transfers principle”); $I$ is nonnegative (positive
out of equality for the strict form); $I$ is Schur-convex, or $V$ or $\bar{\bar{x}}$ is
Schur-concave; “impartiality” plus either of the following properties: $I$ is
constant-sum quasi-convex, $V$ or $\bar{\bar{x}}$ is constant-sum quasi-concave or quasi-concave,
$\sum \varphi(x_i)$ is Schur-concave or constant-sum quasi-concave or quasi-concave
or constant-sum concave or concave, $\varphi$ is concave.

(2°) An impartial per person inequality measure $I^a$ is convex; $\varphi'/\varphi''$
is convex (i.e., $\varphi'\varphi'''/\varphi''^2$ is decreasing).

One of these results is that, with “independence,” “rectifiance” implies
“impartiality,” and thus suffices by itself to define Schur-convexity of $I$
or Schur-concavity of $V$. This is so because the rectifiance conditions with
two “incomes” $y_1$ and $y_2$ such that $y_1 < y_2$ imply both
$\varphi^{i\prime}(y_1) - \varphi^{j\prime}(y_2) > 0$ and $\varphi^{i\prime}(y_2) - \varphi^{j\prime}(y_1) < 0$ (resp. $\ge$ and $\le$ for the weak
forms), and thus, when $y_1$ and $y_2$ tend to the same value $y$, $\varphi^{i\prime}(y) = \varphi^{j\prime}(y)$,
for all admissible $y$. Integrating shows that the $\varphi^i$'s can all be written as
$\varphi^i(y) = \varphi(y) + c_i$ where $c_i$ is a constant, which proves the symmetry
of $\sum \varphi^i$ and the “impartiality.” Rectifiance then means that $\varphi'$ is a
decreasing (resp. nonincreasing) function, i.e., that $\varphi$ is a concave function
(with correspondence between strict and weak forms).

Now, concavity of $\varphi$ is also equivalent to concavity of $\sum \varphi(x_i)$ with
correspondence between weak and strict forms: $x^1$ and $x^2$ being two
different $x$'s and $\lambda$ a scalar such that $0 < \lambda < 1$,

$$\lambda \varphi(x_i^1) + (1 - \lambda)\,\varphi(x_i^2) \le \varphi[\lambda x_i^1 + (1 - \lambda)\, x_i^2]$$

for all $i$'s implies

$$\lambda \sum \varphi(x_i^1) + (1 - \lambda) \sum \varphi(x_i^2) \le \sum \varphi[\lambda x_i^1 + (1 - \lambda)\, x_i^2],$$

and if $\varphi$'s concavity is strict, the first inequality is strict for at least one $i$
(since $x_i^1 \ne x_i^2$ for at least one $i$), and the second inequality is strict;
conversely, if $\sum \varphi(x_i)$ is concave, choosing $x$'s each with equal $x_i$'s shows
that $\varphi(y)$ has the same concavity (strict or weak).

$\sum \varphi(x_i)$'s concavity in turn implies its quasi-concavity and its
constant-sum concavity, either of which implies its constant-sum quasi-concavity,
which, with its symmetry, implies its Schur-concavity, and thus its
rectifiance, which is equivalent to $\varphi$'s concavity and thus to $\sum \varphi(x_i)$'s, with
correspondence between strict and weak forms. All these properties are
thus equivalent to each other. Besides, the ordinal properties of
quasi-concavity, constant-sum quasi-concavity and Schur-concavity are the
same ones for $\sum \varphi(x_i)$, $V(x)$, and $\bar{\bar{x}}(x)$. And $V$ and $\bar{\bar{x}}$'s constant-sum
quasi-concavity and Schur-concavity are respectively equivalent to constant-sum
quasi-convexity and Schur-convexity of $I$ ($I^a$ and $I^r$), with correspondence
between strict and weak forms.

Schur-convexity of $I$ (and thus, with “independence,” mere rectifiance,
i.e., the transfers principle) was seen to imply $I \ge 0$ for the weak forms and
$I > 0$ out of equality for the strict ones. The converse is obviously also
true with “independence” and “impartiality,” since $\varphi(\bar{x}) \ge (1/n) \sum \varphi(x_i) = \varphi(\bar{\bar{x}})$
for all admissible $x$'s is equivalent both to $\bar{x} \ge \bar{\bar{x}}$, i.e., $I \ge 0$, for all
admissible $x$'s and to $\varphi$'s weak concavity, and $\varphi(\bar{x}) > (1/n) \sum \varphi(x_i) = \varphi(\bar{\bar{x}})$
for all admissible $x$'s such that the $x_i$'s are not all equal is equivalent both
to $\bar{x} > \bar{\bar{x}}$, i.e., $I > 0$, out of equality and to $\varphi$'s strict concavity.$^{12}$

But, furthermore, with “independence,” $I$'s nonnegativity or positivity
out of equality implies “impartiality” and is thus by itself equivalent to
the other mentioned properties. To show this, let us consider a small
transfer of $\epsilon$ from $i$ to $j$ and from $j$ to $i$, starting from a situation of equality
where $x_k = y$ for all $k$'s. If $k$ receives $\epsilon$,

$$d\varphi^k(y) = \epsilon\,\varphi^{k\prime}(y) + (\epsilon^2/2)[\varphi^{k\prime\prime}(y) + \theta_1'(\epsilon)]$$

where $\theta_1'(\epsilon)$ tends to zero with $\epsilon$. If $\epsilon$ is taken from it (i.e., if it receives $-\epsilon$),

$$d\varphi^k(y) = -\epsilon\,\varphi^{k\prime}(y) + (\epsilon^2/2)[\varphi^{k\prime\prime}(y) + \theta_2'(\epsilon)]$$

where $\theta_2'(\epsilon)$ tends to zero with $\epsilon$. Thus, the effect on $\sum \varphi^k(x_k)$ of a small
transfer $\epsilon$ from $i$ to $j$ is

$$d\left[\sum \varphi^k(x_k)\right] = \epsilon \cdot [\varphi^{j\prime}(y) - \varphi^{i\prime}(y)] + (\epsilon^2/2)[\varphi^{i\prime\prime}(y) + \varphi^{j\prime\prime}(y) + \theta_1(\epsilon)]$$

where $\theta_1(\epsilon)$ tends to zero with $\epsilon$, and the effect of the reverse transfer is

$$d\left[\sum \varphi^k(x_k)\right] = -\epsilon \cdot [\varphi^{j\prime}(y) - \varphi^{i\prime}(y)] + (\epsilon^2/2)[\varphi^{i\prime\prime}(y) + \varphi^{j\prime\prime}(y) + \theta_2(\epsilon)]$$

where $\theta_2(\epsilon)$ tends to zero with $\epsilon$. These two operations do not change
$\bar{x} = y$. If $I \ge 0$ (resp. $> 0$ out of equality), in the new situations we must
have $\bar{\bar{x}} \le y$ (resp. $<$) and thus $\sum \varphi^k(x_k) = \sum \varphi^k(\bar{\bar{x}}) \le \sum \varphi^k(y)$ (resp. $<$)
by definition of $\bar{\bar{x}}$ and since the $\varphi^k$'s are increasing functions. That is,
the two written variations of $\sum \varphi^k$ must be nonpositive (resp. negative).
When $\epsilon$ is small enough, this requires both $\varphi^{i\prime}(y) \le \varphi^{j\prime}(y)$ and
$\varphi^{i\prime}(y) \ge \varphi^{j\prime}(y)$, i.e., $\varphi^{i\prime}(y) = \varphi^{j\prime}(y)$. Integrating shows that all the $\varphi^k$'s
are of the form $\varphi^k(y) = \varphi(y) + c_k$ where $c_k$ is a constant, which means
that $\sum \varphi^k$, $V$, $\bar{\bar{x}}$, $I$ are symmetric functions. Given this result, the
nonpositivity (resp. negativity) of the above differentials when $\epsilon$ is small
enough becomes equivalent to $\varphi'' \le 0$ (resp. $\varphi'' < 0$ almost everywhere),
i.e., to concavity of $\varphi$ (weak or strict).


12 More generally, $\sum \varphi^i(x_i)$ is strictly (resp. weakly) concave if and only if each $\varphi^i$
is strictly (resp. weakly) concave. The sufficiency is proved in the same way by replacing
$\varphi(x_i)$ by $\varphi^i(x_i)$, and the necessity is proved by considering $x$'s which differ by only one
of their coordinates.





Concavity of $\sum \varphi(x_i)$ is one of these equivalent properties. If $\bar{\bar{x}}$ were
concave, which is equivalent to $I^a = \bar{x} - \bar{\bar{x}}$ being convex (and implies a
constant-sum convex $I^r$), $\bar{\bar{x}}$ and $V$ would be quasi-concave and all the other
mentioned equivalent properties would follow. Now, the specific $\varphi(y)$'s
with which we began the study, $y^{1-\epsilon}$, $-e^{-\alpha y}$, $(y + \xi)^{1-\epsilon}$, were shown to give
convex $I^a$'s for $\epsilon$ and $\alpha$ which give nonnegative (resp. positive out of
equality) or Schur-convex $I^a$'s. But this is not true for all $\varphi$'s. Rather, by
imitating the proof of [5, Theorem 106]$^{13}$ we can show that an
“independent impartial” inequality measure $I^a$ with $\varphi'' < 0$ is convex if and
only if $\varphi'/\varphi''$ is convex (i.e., $\varphi'\varphi'''/\varphi''^2$ is decreasing). Recalling that $I_c$,
and its special and limit cases $I_r$ and $I_l$, constitute the class of independent
inequality measures with linear $\varphi'/\varphi''$, we see that they fall in this category
and we find again that they are convex.
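
A numerical sanity check of this convexity claim can be sketched as follows. This is an illustration under stated assumptions, not the paper's proof: the parameter values for $\epsilon$, $\alpha$, $\xi$ are arbitrary, the helper names are hypothetical, and only midpoint convexity on random samples is tested. For the three families, $\varphi'/\varphi''$ works out to $-y/\epsilon$, $-1/\alpha$ and $-(y + \xi)/\epsilon$, which are linear in $y$.

```python
import math
import random

EPS, ALPHA, XI = 0.5, 0.7, 2.0   # assumed illustrative parameter values


def make_absolute_measure(phi, phi_inv):
    """Build I_a(x) = mean(x) - phi_inv(mean of phi(x_i))."""
    def measure(x):
        n = len(x)
        return sum(x) / n - phi_inv(sum(phi(xi) for xi in x) / n)
    return measure


measures = {
    "power": make_absolute_measure(
        lambda y: y ** (1 - EPS), lambda v: v ** (1 / (1 - EPS))),
    "exponential": make_absolute_measure(
        lambda y: -math.exp(-ALPHA * y), lambda v: -math.log(-v) / ALPHA),
    "shifted power": make_absolute_measure(
        lambda y: (y + XI) ** (1 - EPS), lambda v: v ** (1 / (1 - EPS)) - XI),
}


def midpoint_convex(measure, trials=2000, n=4, seed=0):
    """Check I((x + z)/2) <= (I(x) + I(z))/2 on random positive income vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(0.5, 10.0) for _ in range(n)]
        z = [rng.uniform(0.5, 10.0) for _ in range(n)]
        mid = [(a + b) / 2 for a, b in zip(x, z)]
        if measure(mid) > (measure(x) + measure(z)) / 2 + 1e-9:
            return False
    return True


results = {name: midpoint_convex(m) for name, m in measures.items()}
```

A passing check on samples does not establish convexity; it only fails to find a counterexample for these parameter values.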

XI.e. Generalizations of the “Transfers Principles,” “Rectifiance,” Schur-Convexity, “Isophily”

In this section, we describe very briefly, without proof or precision, the
relation between some of the above results and other meaningful properties
of inequality measures (cf. [11-13]).

In Section VII, we have seen that $I^a = \bar{x} - \varphi^{-1}[(1/n) \sum \varphi(x_i)]$ with
concave $\varphi$, or $I^r = I^a/\bar{x}$, always satisfies the “principle of diminishing
transfers” if and only if $\varphi'$, $\varphi''$, and $\varphi'''$ alternate in sign, if these derivatives
exist. The satisfaction of this “principle” for all admissible x,, , x k , x x , 

for a Schur-convex $I(x)$, constitutes a generalization of Schur-convexity
which we may call 2-rectifiance (or rectifiance of order 2). The preceding
property is the form it takes for “independent” measures and existence
of these derivatives. If we call z, — , and V(x) a social evaluation 

function (increasing function of $\bar{\bar{x}} = \bar{x} - I^a$ or $\bar{x} \cdot (1 - I^r)$), $I$ is 2-rectifiant
if and only if $V(x^1) \ge V(x^2)$ for all $x^1$, $x^2$ such that $z^1 \ge z^2$; and $z^1 \ge z^2$
if and only if $V(x^1) \ge V(x^2)$ for all 2-rectifiant measures. This latter
proposition is also true if we restrict ourselves to 2-rectifiant and
independent measures. We can define, with similar relations and theorems, further
degrees of principles of transfers or rectifiance (corresponding to further
derivatives of $\varphi$ alternating in sign in the “independent” case when these
derivatives exist), and of “isophilies” or dominance of integrals. Each
next degree represents one step more in egalitarianism. The ultimate
degree is $\bar{\bar{x}} = \operatorname{Min}_i x_i$ with “impartiality,” degree one is Pigou and Dalton's

13 Similar properties have been used in justice theory in [1, Theorem 19], and
generalizations of this theorem were used in the theory of choice under uncertainty in [10, Chap.
VIII], to study the convexity properties of risk- and insurance-premia, which are the
risk-theory meaning of $I^a$. We recall that we require here the property to hold for all
$n$'s (cf. Section II.c.).





principle of transfers and [1]'s “rectifiance” of functions and “isophily”
on distributions, and degree zero just is “benevolence,” i.e., $V$ is an
increasing function, and [3]'s “fundamental dominance” for distributions.

Another generalization is to consider all these properties for $x_i$'s which
are no longer unidimensional magnitudes but multidimensional vectors
[12, 13].


References 

1. S.-Ch. Kolm, The optimal production of social justice, paper presented at the
1966 conference of the International Economic Association on Public Economics
in Biarritz. Published in the Proceedings of this conference (“Economie Publique,”
CNRS, Paris, 1968, and “Public Economics,” Macmillan, New York).

2. A. B. Atkinson, On the measurement of inequality, J. Economic Theory 2 (1970).

3. S.-Ch. Kolm, “Justice et équité,” CNRS, Paris, 1972.

4. A. Ostrowski, Sur quelques applications des fonctions convexes et concaves au
sens de I. Schur, J. Math. Pures Appl. 31 (1952).

5. G. H. Hardy, J. E. Littlewood, and G. Pólya, “Inequalities,” 2nd ed., Cambridge
Univ. Press, Cambridge, 1952-1964.

6. J. Karamata, Sur une inégalité relative aux fonctions convexes, Publications
Mathématiques de l'Université de Belgrade 1 (1932).

7. G. Birkhoff, Tres observaciones sobre el algebra lineal, Rev. Universidad Nacional
de Tucuman, Serie A, 5 (1946), 147-151.

8. C. Berge, “Espaces topologiques et fonctions multivoques,” Dunod, Paris, 1959.

9. H. Dalton, The measurement of the inequality of incomes, Economic Journal
30 (1920), 348-361.

10. S.-Ch. Kolm, “Les choix financiers et monétaires (théorie et technique modernes),”
Dunod, Paris, 1966.

11. S.-Ch. Kolm, Rectifiances et dominances intégrales de tous degrés, mimeographed,
CEPREMAP, Paris, 1974.

12. S.-Ch. Kolm, More equal distributions of bundles of commodities, mimeographed,
CEPREMAP, Paris, 1973.

13. S.-Ch. Kolm, Multidimensional egalitarianisms, mimeographed, CEPREMAP,
Paris, 1975.





JOURNAL OF ECONOMIC THEORY 13, 112-117 (1976)


On the Existence of Cournot Equilibrium 
without Concave Profit Functions* 

John Roberts 

Graduate School of Management, Northwestern University, Evanston, Illinois 60201

AND 

Hugo Sonnenschein 

Department of Economics, Northwestern University, Evanston, Illinois 60201 
Received April 21, 1975; revised June 30, 1975 


This communication is concerned with the existence of equilibrium 
in Cournot’s model of oligopoly [2, Chap. VII]. This question, of course, 
has been examined often (see, e.g., [1-3]). To our knowledge, however, all 
previous treatments of the problem have assumed (either directly or 
indirectly) that the reaction curves of the firms are single-valued continuous 
functions or convex-valued, upper hemicontinuous correspondences, so 
that the Brouwer or Kakutani fixed point theorem may be used. In the
constant marginal cost case, this assumption amounts to a condition that 
marginal revenue always be decreasing. Given some regularity conditions, 
marginal revenue will be falling at any profit-maximizing output. However, 
to assume that this condition holds globally is extremely restrictive. For 
example, one can easily construct examples in which it is not met even 
though the demand arises from a single competitive consumer with 
homothetic preferences. 1 

We will consider here the case in which the price of the single
homogeneous product is given by an upper hemicontinuous correspondence of
the total production. Although this assumption is classical, it is
nevertheless still restrictive, since it does not allow for general equilibrium
effects. Moreover, we also assume costless production (although this

* We would like to thank Robert Aumann and Herve Moulin for their suggestions 
and assistance and the National Science Foundation of the United States for its financial 
support. This work was done while Roberts was a research fellow at CORE. 

1 Note that obtaining a failure of concavity places restrictions on the marginal rates 
of substitution only along three rays. 


Copyright © 1976 by Academic Press, Inc.

All rights of reproduction in any form reserved.



condition can be relaxed somewhat). 2 Yet even in this very simple model, 
the reaction curves of profit-maximizing firms need not be convex valued 
or upper hemicontinuous. However, we can still establish existence of 
Cournot equilibrium by using a simple fixed-point theorem for real- 
valued correspondences which does not assume that these properties 
hold. 

Suppose there are $m$ firms, each of which can costlessly supply the single
homogeneous commodity in any nonnegative quantity less than or equal to
some bound $B$. Let $y_i$, $i = 1, \ldots, m$ denote these quantities, and normalize
$B = 1$. The demand for the commodity is given by the inverse demand
correspondence $\Phi: [0, m] \to R_+$. We assume that $\Phi$ is a closed
correspondence with a compact range. This market structure is called a
symmetric oligopoly. Given any aggregate output level $x = \sum_{j \ne i} y_j$ selected
by the other firms, the objective of firm $i$ is to maximize profit $py$ by the
choice of its output $y$, where $p = \varphi(x + y)$ and $\varphi(z) \equiv \max \Phi(z)$.

A Cournot equilibrium is an $(m + 1)$-tuple $(p, y_1, \ldots, y_m)$ such that
$p = \varphi(\sum_{i=1}^m y_i)$, and for each $i$, $y_i$ maximizes $y\varphi(y + \sum_{j \ne i} y_j)$ on $[0, 1]$.

Graphically, the situation is depicted in Fig. 1, where the relationship
between $y$ and $p$ for each $x = \sum_{j \ne i} y_j$ is given by the correspondence $\theta_x$,
where $\theta_x(y) = \Phi(x + y)$. The level curves of profit are rectangular
hyperbolae $py = c$. Profit is maximized for any $x$ at those $y$ where $\theta_x(y)$ just

2 We can allow for costly production by all firms under a common constant returns
to scale technology if input prices either are given competitively to the industry or
depend only on total industry demand. This is done simply by reinterpreting the
residual demand curve.







touches the highest level curve. Note that as $x$ increases from $\bar{x}$ to $\bar{x} + \delta$,
the relevant $\theta_x$ shifts left by the amount $\delta$, since
$\theta_{\bar{x}+\delta}(y) = \Phi(\bar{x} + y + \delta) = \theta_{\bar{x}}(y + \delta)$.

It is clear that at $x = \bar{x}$ there are exactly two values of $y$ which maximize
profits. Thus, the reaction curve is not convex valued. Further, it need not
have a closed graph: as $x_n \to x$, the profit-maximizing choices $y_n$ may
converge to $\bar{y}$, where $\bar{y}$ is not profit-maximizing given $x$. This occurs, for
example, when the vertical section of the inverse demand enters the region
$[0, 1]$ from the right. Despite these problems, we are able to prove the
following theorem.

Theorem. There exists a Cournot equilibrium $(p, y_1, \ldots, y_m)$, where
$y_1 = \cdots = y_m = y$, in a symmetric oligopoly.

To prove this theorem, we will first prove a result establishing the
existence of fixed points for a class of real-valued correspondences. We
will then show that the reaction curves of the firms belong to this class.
To motivate the fixed-point results, consider the problem of proving
existence of a fixed point for a real-valued function $f$ from the unit interval
$[0, 1]$ to itself. If $f$ is not continuous, there of course need not be such a
point. However, as inspection of Fig. 2 suggests, a fixed point will exist
if the only discontinuities in $f$ take the form of “upward jumps,” or, more
formally, if $f$ is continuous from the right [$x_n > x$, $x_n \to x$ implies
$f(x_n) \to f(x)$]$^3$ and upper semicontinuous from the left [$x_n < x$, $x_n \to x$

3 In fact, it is sufficient that $f$ be lower semicontinuous from the right.







implies $\limsup f(x_n) \le f(x)$]. For a correspondence $F$, an appropriate
generalization of these conditions is that $F$ be closed from the right
[i.e., $x_n > x$, $x_n \to x$, $y_n \in F(x_n)$ and $y_n \to y$ implies $y \in F(x)$] and that
the function given by taking $\inf F(x)$ for each $x$ be upper semicontinuous
from the left.


Lemma. For each $x \in [0, 1]$, let $F(x)$ be a nonempty subset of $[0, 1]$, and
let $f: [0, 1] \to [0, 1]$ be defined by $f(x) = \inf\{y \mid y \in F(x)\}$. Suppose $F$
is closed from the right and $f$ is upper semicontinuous from the left. Then
there exists $\bar{x} \in [0, 1]$ such that $\bar{x} \in F(\bar{x})$.

Proof of the lemma. Let $\bar{x}$ be the smallest value of $x$ such that there
exists $y \in F(x)$ with $y \le x$, and let $\bar{y} \le \bar{x}$ belong to $F(\bar{x})$. Existence of such
an $\bar{x}$ and $\bar{y}$ follows from $F$ being closed from the right. Then $\bar{x}$ is a fixed
point for $F$. To see this, note that if $\bar{x} = 0$, then $0 \le \bar{y} \le \bar{x} = 0$, while
if $\bar{x} > 0$, we can approach $\bar{x}$ from the left via a sequence $(x_n)$ for which
$x_n < f(x_n)$. Then $\bar{x} \le \limsup f(x_n)$, but, since $f$ is upper semicontinuous
from the left, $\limsup f(x_n) \le f(\bar{x}) \le \bar{y} \le \bar{x}$. Thus $\bar{x} = \inf F(\bar{x})$. But
$\bar{x} \ge \bar{y} \in F(\bar{x})$, so $\bar{x} \in F(\bar{x})$. Q.E.D.
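
For a single-valued map the idea can be sketched numerically. The code below is a hedged illustration, not the construction in the proof: instead of taking the smallest $x$ with $f(x) \le x$, it runs a bisection that keeps $f(lo) > lo$ and $f(hi) \le hi$, which under the two one-sided conditions also closes in on a fixed point. The map `f`, with one upward jump at 0.5, is a hypothetical example.

```python
def f(x):
    """Hypothetical map of [0, 1] into itself with a single upward jump at x = 0.5.

    It is continuous from the right and upper semicontinuous from the left.
    """
    return 0.6 + 0.2 * x if x < 0.5 else 0.8 + 0.1 * x


def fixed_point_upward_jumps(func, tol=1e-12):
    """Bisection keeping the invariant func(lo) > lo and func(hi) <= hi."""
    lo, hi = 0.0, 1.0
    if func(lo) <= lo:          # func(0) >= 0 always, so this means func(0) == 0
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) <= mid:
            hi = mid
        else:
            lo = mid
    return hi


x_star = fixed_point_upward_jumps(f)
```

For this `f` the only fixed point lies to the right of the jump, at 8/9, where $0.8 + 0.1x = x$. A downward jump would break the invariant's limit argument, which is why the one-sided conditions matter.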

Let $R$ be the reaction correspondence of any firm, i.e., $R: [0, (m - 1)] \to [0, 1]$
is defined for each $x$ as the set of maximizers of $y\varphi(y + x)$. A value
$\bar{y} \in [0, 1]$ such that $\bar{y} \in R((m - 1)\bar{y})$ will define a Cournot equilibrium
$(\varphi(m\bar{y}), \bar{y}, \ldots, \bar{y})$. Existence of such a $\bar{y}$ is equivalent to existence of
a fixed point for the correspondence $F: [0, 1] \to [0, 1]$ defined by
$F(x) = R((m - 1)x)$. Since $F$ will be nonempty valued and closed from
the right if and only if $R$ has these properties, while $f = \min F$ will be
upper semicontinuous from the left if and only if $r = \min R$ is upper
semicontinuous from the left, it is sufficient to show that $R$ and $r$ have
these properties.
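
The reduction to a fixed point of $F(x) = R((m - 1)x)$ can be sketched on a grid. This is a minimal illustration under assumptions that are not in the paper: the inverse demand is the single-valued linear function $\varphi(z) = \max(0, 1 - z)$, outputs are restricted to a finite grid, and all names are hypothetical. Following the lemma, the code looks for the smallest grid value $y$ with $\min R((m - 1)y) \le y$.

```python
def inverse_demand(z):
    """Assumed linear inverse demand; stands in for phi(z) = max Phi(z)."""
    return max(0.0, 1.0 - z)


def min_best_response(x, grid, price):
    """r(x) = min R(x): smallest grid output maximizing y * price(x + y)."""
    best = max(y * price(x + y) for y in grid)
    return min(y for y in grid if y * price(x + y) >= best - 1e-12)


def symmetric_cournot(m, price, steps=300):
    """Smallest grid y with r((m - 1) y) <= y, returned with the market price."""
    grid = [k / steps for k in range(steps + 1)]
    for y in grid:
        if min_best_response((m - 1) * y, grid, price) <= y:
            return price(m * y), y
    raise ValueError("no grid point satisfies r((m - 1) y) <= y")


p_star, y_star = symmetric_cournot(m=2, price=inverse_demand)
```

With two firms and this demand, each firm's best response to $x$ is $(1 - x)/2$, so the symmetric equilibrium output is 1/3 and the price is 1/3. On a grid the returned value is only an approximation in general; here 1/3 happens to be a grid point.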

That $R$ and thus $F$ have closed, nonempty values follows from the profit
function, $py$, being continuous and the set of all $(p, y)$ such that
$p \in \Phi(x + y)$ being compact for each $x$. To see that $R$ is closed from the
right, let $x_n \to x$, $x_n > x$, $y_n \in R(x_n)$, $y_n \to \bar{y}$. If $\bar{y} \notin R(x)$, then there
exists $\hat{y} \in [0, 1]$ such that $\varphi(x + \hat{y})\hat{y} > \varphi(x + \bar{y})\bar{y}$. Since $x_n > x$, unless
$\hat{y} = 0$, the firm could have, for large enough $n$, chosen $\hat{y}_n = \hat{y} + (x - x_n)$
when it selected $y_n$. For this choice, $\varphi(x_n + \hat{y}_n) = \varphi(x + \hat{y})$, and thus,
since $y_n$ is optimal given $x_n$,
$\varphi(x_n + y_n)y_n \ge \varphi(x_n + \hat{y}_n)\hat{y}_n = \varphi(x + \hat{y})\hat{y}_n \to \varphi(x + \hat{y})\hat{y} > \varphi(x + \bar{y})\bar{y}$. But, since $\Phi$
has a closed graph, $\varphi(x + \bar{y})\bar{y} \ge \limsup \varphi(x_n + y_n)y_n$. This yields
a contradiction if $\hat{y} > 0$, while if $\hat{y} = 0$, then $0 = \varphi(x + \hat{y})\hat{y} >
\varphi(x + \bar{y})\bar{y} \ge 0$, another contradiction.

To show that the function $r$ defined by $r(x) = \min R(x)$ is upper
semicontinuous from the left, we will show that if $x_n \to x$, $x_n < x$, then
$r(x_n) \le r(x) + (x - x_n)$. Then taking $\limsup r(x_n)$ yields the result.$^4$

Let $\bar{y} = r(x)$. For any $y \ge \bar{y}$, $\varphi(x + \bar{y})\bar{y} \ge \varphi(x + y)y \ge \varphi(x + y)\bar{y}$,
since $\varphi \ge 0$. Thus, for $y \ge \bar{y}$, $\delta > 0$, $(\bar{y} + \delta)\varphi(x + \bar{y}) \ge (y + \delta)\varphi(x + y)$.
Rewrite this last inequality condition as “for any $u \ge (\bar{y} + \delta)$,
$(\bar{y} + \delta)\varphi((x - \delta) + (\bar{y} + \delta)) \ge u\varphi(x - \delta + u)$.” Note that, for any $x$,
if $u$ is such that for all $z \ge u$, $u\varphi(x + u) \ge z\varphi(x + z)$, then $r(x) \le u$. This
follows since if $u < r(x)$, then $u\varphi(x + u) \ge r(x)\varphi(x + r(x))$ implies
$u \in R(x)$, a contradiction if $u \le 1$, while the case $u > 1$ is trivial. Finally,
this observation and the expression in quotation marks yield
$r(x - \delta) \le \bar{y} + \delta = r(x) + \delta$. Now, take $\delta = x - x_n$.

The model we have studied bears the following alternative interpretation
(see [5]). Consider two commodities which are demanded in fixed
proportions, say one-to-one, but which are produced by two separate firms.
This market structure, called complementary monopoly, was also studied
by Cournot [2, Chap. IX], who considered the example of copper and
zinc used jointly to produce brass. It is natural to represent the demands
for the two commodities, which must be identical at each pair of prices,
by a correspondence $\Psi: [0, 2] \to R_+$ with a closed graph. Generically,
we write $x \in \Psi(p + q)$, where $x$ is the quantity demanded of either
commodity and $p$ and $q$ represent the prices of the two commodities. Then,
a Cournot equilibrium is a triple $(x, p, q)$ such that $px$ is maximal given $q$
and $qx$ is maximal given $p$, where $x = \max \Psi(p + q)$. Our theorem then
provides conditions for the existence of Cournot equilibrium in situations
of complementary monopoly.

It should be clear that the methods used here to prove existence depend 
crucially on the assumptions that the firms are identical and that price 
depends only on total output. Given the nonconvexities that arise even 
in this simple, symmetric case, one must conjecture that in more general 
models without this special structure, equilibrium might well fail to exist. 
In fact, we have recently constructed an example of a very simple economy 
with two firms, each of which costlessly produces a single commodity, 
in which no Cournot-Chamberlin equilibrium exists (see [4]).


4 We are indebted to Herve Moulin for the following argument.


References

1. K. J. Arrow and F. H. Hahn, “General Competitive Analysis,” Holden-Day, San
Francisco/London, 1971.

2. A. Cournot, “Recherches sur les principes mathématiques de la théorie des
richesses,” (1838). Translated by N. T. Bacon as “Researches into the Mathematical
Principles of the Theory of Wealth,” Hafner, London, 1960.

3. C. R. Frank Jr. and R. Quandt, On the existence of Cournot equilibrium, Int.
Econ. Rev. 4 (1963), 92-100.

4. J. Roberts and H. Sonnenschein, On the foundations of the theory of monopolistic
competition, Econometrica, to appear.

5. H. Sonnenschein, The dual of duopoly is complementary monopoly: or, two of
Cournot's theories are one, J. Political Econ. 76 (1968), 316-318.



JOURNAL OF ECONOMIC THEORY 13, 118-137 (1976)


The Optimal Growth of the Firm in a Growing Environment 

Jacques Lesourne

Conservatoire National des Arts et Métiers, 75141 Paris, France, and
SEMA (Metro International), 16-18, rue Barbes, 92128 Montrouge, France

Received May 13, 1975; revised October 6, 1975 


The paper presents a growth model based on three essential assumptions:
perfect knowledge of the future, decreasing returns on investment in a stagnant
economy, appearance of new investment possibilities with the growth of the
economy. Two cases are considered: the self-financing growth and the borrowing
situation in the context of two management policies, the maximization of the
discounted flow of dividends and the maximization of the growth rate.


1. Only a few years ago, economic theory, which had devoted a 
considerable amount of research to the growth mechanism of national 
economies, began to consider the growth of the firm. Since then, several 
books have been published on the subject [1,9, 11] and have stressed 
the importance of a few preliminary notions: 

The commodity concept (a commodity being consumed as an
input or produced as an output) must be replaced by the broader resource
concept imported from political science. The resources include the brand
images, the relations with the distribution network, the technical reports,
the organizational rules, the relations between members of the staff, the
banking relations, etc.

The growth of the firm is associated with a breeder mechanism since
it consumes resources in a first phase (investment, advertising, managers'
time, etc.) and generates a greater amount of them in a second phase
(cash-flow, corporate image, trained managers' availability, etc.).

The growth is limited by a rich variety of constraints which are not 
only financial, but also human, commercial, technical, etc. 

Several types of criteria may be selected for the study of optimal 
growth paths. Even in the simple “certain future” case, one may consider 
at one extreme the maximization of the present worth of future dividends 
and at the other extreme the maximization of the growth rate (which may 
be meaningless for the collectivity, but used in fact by top managers). 



Copyright © 1976 by Academic Press, Inc.

All rights of reproduction in any form reserved. 





With these notions in mind, it is possible to study various growth
situations differing in the knowledge about the future, the external
environment, the selected criteria, and the types of constraint.

2. The purpose of this paper is to present a growth model of the firm 
based on three essential assumptions: 

1. There is perfect knowledge of the future. 

2. At a given time, the marginal profitability of investment is a 
decreasing function of the total amount of investment made. (This 
assumption takes into account the fact that, when the economy is stagnant, 
the investment possibilities are progressively exhausted by an investing 
firm.) 

3. As time goes on, new investment possibilities appear, and for 
a given amount of total investment, the profitability of an additional 
investment increases with time. (In other words, the growth of the economy 
opens new investment possibilities.) 

Of course, these assumptions can be criticized on economic grounds. 
For instance, the policy of the competitors is not directly introduced in 
the model though it has obviously a direct impact on the profitability of 
investment. 

Nevertheless, the model gives an acceptable description of reality for
stable oligopoly markets. On such markets, when the economy is stagnant,
any new investment means an increase of the firm’s share of the market.
Frequently, the bigger the share of the market already controlled, the
more difficult it is to attract clients from competitors. Conversely,
when the economy grows, the industry market develops and all the
main competitors have new investment possibilities.

More serious is the fact that, in this paper, technology is, for the sake 
of simplicity, assumed constant through time. Since technical progress is 
one of the main factors which open new investment possibilities, we have 
briefly discussed at the end of the paper the way of taking it into account. 

The model has already been presented in [9], but it is interesting to
develop a deeper analysis of it. Time $t$ being a continuous parameter of
the open half-line $(0, +\infty)$, we shall introduce the following notations:

$I(t)$ will be the total net investment of the firm at time $t$,
$\mu(t)\,dt$, the cash-flow of the firm between $t$ and $t + dt$,
$c(t)\,dt$, the investment expenditure between $t$ and $t + dt$,
$\rho\,dt$, the loss of value between $t$ and $t + dt$ of an investment having a
value of 1 at time $t$; $\rho$ is assumed to be constant in time for a given type of
investment.





If the possible investments made by the firm belong to only one type, 
one may write: 

$$I(t) = \int_0^t \exp[-\rho(t - \tau)]\, c(\tau)\, d\tau + \exp[-\rho t]\, I(0), \tag{1}$$

$I(0)$ being the initial investment made by the firm (at time $t = 0$).

Equation (1) means that any investment depreciates through time at
a constant rate $\rho$. Such an assumption does not represent reality perfectly,
but it has the advantage of corresponding to common accounting practices
and leading to easy mathematical developments.
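As a rough numerical illustration of Eq. (1), the sketch below approximates the integral by a midpoint sum and compares it with the closed form available when the expenditure is constant. The function name `net_investment` and all numerical values (constant expenditure of 2, depreciation rate 0.1, initial investment 5) are assumptions chosen for illustration; they do not come from the paper.

```python
import math

def net_investment(c, rho, i0, t, steps=10_000):
    """Approximate Eq. (1) by a midpoint sum for an expenditure path c(tau)."""
    dt = t / steps
    total = 0.0
    for n in range(steps):
        tau = (n + 0.5) * dt
        # Expenditure made at tau has depreciated at rate rho for (t - tau).
        total += math.exp(-rho * (t - tau)) * c(tau) * dt
    # The initial investment I(0) depreciates over the whole interval [0, t].
    return total + math.exp(-rho * t) * i0

# Hypothetical inputs: constant expenditure 2.0, rho = 0.1, I(0) = 5.0, t = 10.
rho, i0, t = 0.1, 5.0, 10.0
numeric = net_investment(lambda tau: 2.0, rho, i0, t)
# For constant c the integral in Eq. (1) has the closed form (c / rho) * (1 - exp(-rho * t)).
closed_form = 2.0 / rho * (1 - math.exp(-rho * t)) + math.exp(-rho * t) * i0
print(numeric, closed_form)
```

With a constant expenditure the net investment approaches the level c/ρ, at which new expenditure just offsets depreciation.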

Assumptions two and three will be embodied in the equations relating 
the cash-flow and the net investment: 

$$\mu(t) = \lambda[r(t)]\, I(t), \tag{2}$$

$$r(t) = \exp[-\alpha t]\, I(t). \tag{3}$$

$\lambda$ is the average cash-flow between $t$ and $t + dt$ per unit of net investment
at time $t$. It is different for different firms, even in the same sector of the
economy. (In some simple growth models, $\lambda$ is assumed to be a constant
and is then also equal to the marginal profitability of investment.)

$\lambda$ is a function of a parameter $r(t)$ which will be called the relative net
investment and is introduced as follows: If $J(t)$ denotes the total net investment
of the sector and if this sector, as a consequence of the growth
of the entire economy, grows proportionally at rate $\alpha$, the net investment
of the firm represents a fraction $I(t)/[J_0 \exp(\alpha t)]$ of the total investment, $J_0$
being a constant. This fraction is proportional to $r(t)$.

$\lambda$ will be supposed to be a continuous decreasing function of $r(t)$:

For $\alpha = 0$ or for $t$ given, $\lambda$ is a decreasing function of $I(t)$, which
corresponds to assumption two.

For $I$ given, the relative net investment decreases with $t$ if $\alpha > 0$, which
corresponds to assumption three.

The behavior of the functions $\lambda(r)$ and $\lambda(r)\,r$ will be of importance for
the discussion. $\lambda(r)$ will always be a decreasing function of $r$, equal to zero
for a finite value of $r$ or tending to zero as $r$ increases indefinitely. Four
cases will be considered:

Case I. $\lambda(r)$ is equal to zero for a finite value of $r$; $\lambda(r)\,r$ increases from
zero to a maximum, then decreases and becomes null for a finite value of $r$.

Case II. $\lambda(r)$ tends to zero more quickly than $1/r$ for $r$ increasing
indefinitely; $\lambda(r)\,r$ increases from zero to a maximum, then decreases and
tends to zero as $r$ increases indefinitely.

Case III. $\lambda(r)$ tends to zero less quickly than $1/r$ for $r$ increasing
indefinitely; $\lambda(r)\,r$ increases indefinitely with $r$.

Case IV. $\lambda(r)$ tends to zero like $1/r$ for $r$ increasing indefinitely; $\lambda(r)\,r$
increases with $r$ but tends to an upper limit as $r$ increases indefinitely
(Figs. 1, 2).



Figure 1 Figure 2 


Let us consider the rate of return $j$ of an additional investment of one
unit at time $\tau$ from a given curve $I(t)$ of net investment. By definition,
$j$ is such that the present worth of the additional investment is zero when
discounted at rate $j$:

$$1 = \int_0^\infty \left[\lambda(r_\tau) + r_\tau \frac{d\lambda}{dr_\tau}\right] \exp[-\rho t - jt]\, dt, \tag{4}$$

since the additional investment increases $r$ and decreases the profitability
beyond $\tau$ of the investment made before. From (4), one deduces:

$$j = \lambda(r_\tau) + r_\tau \frac{d\lambda}{dr_\tau} - \rho. \tag{5}$$

If we neglect the effect of the additional investment on the past investment,
we may introduce the intrinsic rate of return $k$:

$$k = \lambda(r_\tau) - \rho. \tag{6}$$
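The two rates can be compared on a concrete profitability schedule. The sketch below assumes a linear schedule λ(r) = a − b·r (an instance of Case I); the function names and the values a = 0.5, b = 0.1, ρ = 0.1 are illustrative assumptions, not values taken from the paper.

```python
def lam(r, a=0.5, b=0.1):
    """Assumed linear profitability schedule (a Case I example): lambda(r) = a - b*r."""
    return a - b * r

def dlam(r, a=0.5, b=0.1):
    """Derivative of the assumed schedule with respect to r."""
    return -b

def rate_of_return(r, rho):
    """Eq. (5): j = lambda(r) + r * dlambda/dr - rho."""
    return lam(r) + r * dlam(r) - rho

def intrinsic_rate(r, rho):
    """Eq. (6): k = lambda(r) - rho, ignoring the effect on earlier investment."""
    return lam(r) - rho

rho = 0.1  # hypothetical depreciation rate
for r in (0.0, 1.0, 2.0):
    print(r, rate_of_return(r, rho), intrinsic_rate(r, rho))
```

Because λ is decreasing, j never exceeds k, and the gap between them widens as r grows.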

Two criteria may be used: 

The maximization of the discounted flow of the dividends $d(t)\,dt$ distributed
by the firm between $t$ and $t + dt$:

$$V = \int_0^\infty d(t) \exp(-it)\, dt, \tag{7}$$

where $i$ is the instantaneous discounting factor of the shareholders, assumed
to be constant in time.

The maximization of the size of the firm at a horizon $T$, measured
by $I_T$.

Two successive financial situations will be considered: 

(i) The self-financing situation, in which the firm, starting with a
capital $K_0$ at time 0, can only develop out of its cash-flow.

(ii) The borrowing situation, in which the firm, starting with a capital
$K_0$ at time 0, has the possibility to borrow at rate $i'$ any sum smaller than
a given fraction $g$ of its total net investment.


The Self-Financing Growth 

3. The cash-flow $\mu(t)\,dt$ made during the infinitesimal interval
$(t, t + dt)$ is divided by the firm between the dividends $d(t)\,dt$ and the gross
investment expenditure $c(t)\,dt$:¹

$$d(t) + c(t) = \mu(t). \tag{8}$$

Equations (1), (2), (3), (4), (5) describe the system. In addition, we have
the conditions:

$$d(t) \ge 0, \tag{9}$$

$$I(0) = K_0, \tag{10}$$

since, to have a meaningful problem, it is interesting to suppose that the
firm has invested initially the whole amount of its capital $K_0$. For the
investment to be profitable, we shall assume:

$$\lambda(K_0) - \rho > 0, \tag{11}$$

so that an investment equal to the initial capital has a positive intrinsic
rate of return.

From (8), (2), and (3), one deduces:

$$d(t) = \lambda(r)\, r \exp(\alpha t) - c. \tag{12}$$

By derivation of (1) and (3) one gets:

$$\dot I = c(t) - \rho I = c(t) - \rho\, r \exp(\alpha t), \tag{13}$$

$$\dot I = (\dot r + \alpha r) \exp(\alpha t). \tag{14}$$

¹ With a certain future, it is not interesting for the firm to keep any cash.

Hence (8) may be written:

$$d(t) = [\lambda(r)\, r - (\rho + \alpha)\, r - \dot r] \exp(\alpha t). \tag{15}$$
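The step from (8) to (15) can be spelled out line by line. The sketch below only combines (12), (13), and (14); it introduces no assumption beyond those relations.

```latex
\begin{align*}
c(t) &= \dot I + \rho\, r \exp(\alpha t)
      && \text{from (13)} \\
     &= (\dot r + \alpha r + \rho r) \exp(\alpha t)
      && \text{from (14)} \\
d(t) &= \lambda(r)\, r \exp(\alpha t) - c(t)
      && \text{from (12)} \\
     &= [\lambda(r)\, r - (\rho + \alpha)\, r - \dot r] \exp(\alpha t).
\end{align*}
```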

The maximization of the present worth $V$ of the dividends is obtained
for the function $r(t)$ which maximizes

$$V = \int_0^\infty [\lambda(r)\, r - (\rho + \alpha)\, r - \dot r] \exp[(\alpha - i)t]\, dt \tag{16}$$

with the constraint of nonnegativeness of dividends

$$\lambda(r)\, r - (\rho + \alpha)\, r - \dot r \ge 0 \tag{17}$$

and the initial condition (10) rewritten

$$r(0) = K_0. \tag{18}$$

The maximization of the size is obtained for the function $r(t)$ such that:

$$\lambda(r)\, r - (\rho + \alpha)\, r - \dot r = 0, \qquad 0 \le t \le T, \tag{19}$$

with the initial condition:

$$r(0) = K_0. \tag{20}$$


Let us consider this second strategy first since the results will serve to 
explain the other one. 

a. Maximization of the Size 

4. To understand the evolution of $\dot r$, let us have a look at Fig. 2, on
which we superimpose straight lines starting from the origin with a variable
slope $(\rho + \alpha)$. Two cases are possible:

Case 1. $(\rho + \alpha)\, r > \lambda r$ for all $r$. $\dot r$ is always negative; $r$ is a decreasing
function of $t$. $\dot r = 0$ for $r = 0$, and $r$ tends to zero for $t$ increasing
indefinitely. Since $I = r \exp(\alpha t)$, the rate of growth of net investment is
lower than $\alpha$. The condition is equivalent to:

$$\alpha > \lambda(r) - \rho, \quad \text{i.e., } \alpha > k(r) \text{ for all } r,$$

or

$$\alpha > \lambda(0) - \rho, \quad \text{i.e., } \alpha > k(0).$$

So, if the rate of development of opportunities is greater than the intrinsic
rate of return (for all $r$), the maximum rate of growth of net investment is
lower than $\alpha$. The profitability of the firm's investment is not sufficient to
ensure, even without any distribution of dividends, a development at the
rate $\alpha$ of occurrence of new investment possibilities (Fig. 3).



Fig. 3. Curves of maximum growth for different initial values of r. 

Case 2. $(\rho + \alpha)\, r$ is smaller, then greater than $\lambda r$. $\dot r$ starts from zero,
increases to a maximum $M$, decreases, is equal to zero for a given value $\bar r$
of $r$, and becomes negative.

This situation arises when the condition

$$\alpha < \lambda(0) - \rho, \quad \text{i.e., } \alpha < k(0)$$

is satisfied.

The shape of the curve $r(t)$ depends on the initial value $K_0$ for $r(t)$.

(i) If $K_0 > \bar r$ or $\lambda(K_0) < \rho + \alpha$, $\dot r$ is always negative, but decreases
in absolute value and tends to zero. $r$ decreases and tends to $\bar r$ when $t$ tends
to infinity.

(ii) If $K_0 = \bar r$ or $\lambda(K_0) = \rho + \alpha$, $r$ is constant. The firm always
maintains the initial level of relative investment.

(iii) If $K_0 < \bar r$ or $\lambda(K_0) > \rho + \alpha$, $\dot r$ is always positive. It may always
decrease, or increase to a maximum and then decrease, but tends to zero
in any case. $r$ increases and tends to $\bar r$ when $t$ tends to infinity (Fig. 4).
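The qualitative behavior described in Cases 1 and 2 can be checked numerically. The sketch below integrates the maximum-growth equation (19) with a simple Euler scheme, assuming a linear schedule λ(r) = a − b·r; the function name, step size, horizon, and parameter values are illustrative assumptions only.

```python
def simulate_max_growth(k0, alpha, rho=0.1, a=0.5, b=0.1, horizon=200.0, dt=0.01):
    """Euler integration of Eq. (19): r' = lambda(r)*r - (rho + alpha)*r,
    with the assumed schedule lambda(r) = a - b*r and r(0) = k0."""
    r = k0
    for _ in range(int(horizon / dt)):
        r += dt * ((a - b * r) * r - (rho + alpha) * r)
    return r

# Case 2 (alpha < k(0) = a - rho): every path tends to r_bar, the root of lambda(r) - rho = alpha.
alpha, rho, a, b = 0.05, 0.1, 0.5, 0.1
r_bar = (a - rho - alpha) / b
for k0 in (1.0, r_bar, 6.0):
    print(k0, simulate_max_growth(k0, alpha), r_bar)

# Case 1 (alpha > k(0)): the relative investment decays toward zero.
print(simulate_max_growth(1.0, alpha=0.5))
```

Starting below, at, or above the limiting value corresponds to situations (iii), (ii), and (i) respectively.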

We can represent the same phenomenon for a given value of $K_0$ and
different values of $\alpha$ in the planes $(r, t)$ and $(I, t)$ (Figs. 5 and 6). We shall
remark that $\bar r$ is such that

$$\lambda(\bar r) - \rho = \alpha,$$

and $\bar r \gtrless K_0$ according to $\alpha \lessgtr \lambda(K_0) - \rho$. $\bar r$ is a function $\bar r(\alpha)$ of $\alpha$. In the
plane $(I, t)$, the curves $r = \text{constant}$ are exponential curves of rate $\alpha$. So, we
observe four different types of growth:













a. When the rate of appearance of investment opportunities is greater
than the maximum intrinsic rate of return, the firm always develops at a
rate lower than $\alpha$, so that $r(t)$, its “share” in the sector, tends to zero, even
if $I(t)$ increases indefinitely.

b. When the rate of appearance of investment opportunities is smaller
than the maximum intrinsic rate of return, but bigger than the intrinsic rate
of return of the initial capital, the firm develops at a rate lower than $\alpha$, but
which increases regularly and tends to $\alpha$, so that the “share” of the firm
tends to a constant.

c. When the rate of appearance of investment opportunities is equal
to the intrinsic rate of return of the initial capital, the firm develops at rate $\alpha$.

d. When the rate of appearance of investment opportunities is smaller
than the intrinsic rate of return of the initial capital, the firm develops at
a rate higher than $\alpha$, but which decreases regularly and tends to $\alpha$, so that
the “share” of the firm tends to a constant.

b. Maximization of the Present Worth 

5. It is well known that a necessary condition for a function $r(t)$ to
maximize the functional

$$H = \int_{t_0}^{t_1} F(r, \dot r, t)\, dt$$

is for that function to be a solution of Euler's equation

$$\frac{d}{dt}\left(\frac{\partial F}{\partial \dot r}\right) - \frac{\partial F}{\partial r} = 0.$$

In the case of maximization of $H$ under the constraint $G(t) = 0$, the same
procedure is applicable if one introduces a Lagrange multiplier $l(t)$ and
if one applies Euler's equation to the Lagrangian function ([4, 5, 8])

$$L(t) = F(t) + l(t)\, G(t).$$

Here, the Lagrangian $L(t)$ is:

$$L(t) = [\lambda(r)\, r - (\rho + \alpha)\, r - \dot r] \exp[(\alpha - i)t] - \xi \exp[(\alpha - i)t]\, [\lambda(r)\, r - (\rho + \alpha)\, r - \dot r - u^2], \tag{21}$$

the Lagrange multiplier being $\xi(t) \exp[(\alpha - i)t]$ and $u$ being a slack
variable used to transform the inequality (17) into an equality.





Standard mathematical derivations² show that the solution is a combination
of two possible regimes:

(i) A regime along which $d(t) = 0$. The firm does not distribute dividends
and follows a path of maximum growth.

(ii) A regime along which:

$$\frac{d(\lambda r)}{dr} - (\rho + i) = 0. \tag{22}$$

This relation corresponds in the plane $(r, t)$ to a horizontal straight line
of ordinate $r^*$. $r^*$ is a function of $i$.

Condition (22) has an easy economic interpretation: The present worth
$R$ of an additional investment of one unit at time $\tau$ from a given curve $I(t)$
of net investment is, when computed at $\tau$ and net of the unit invested:

$$R = \int_0^\infty \frac{d(\lambda r)}{dr} \exp[-\rho t - it]\, dt - 1, \tag{23}$$

$$R = \frac{d(\lambda r)/dr - (\rho + i)}{\rho + i}. \tag{24}$$

So (22) means that the present worth of any additional investment is zero
or has an internal rate of return $j$ equal to the interest rate $i$. This is a
well-known economic condition.

Since $d(\lambda r)/dr$ is, when positive, a decreasing function of $r$ and since
$\lim_{r \to \infty} d\lambda/dr = 0$, we may state that:

(i) $r^*$ exists only if $[d(\lambda r)/dr]_{r=0} > \rho + i$ (otherwise, it is not
interesting to invest at the rate of interest $i$);

(ii) when it exists, $r^*$ is finite.

In the planes $(r, t)$ or $(I, t)$, the curve $r(t) = r^*$ separates two domains.
A firm maximizing its present worth will never be in the upper one.

A growth along the curve $r(t) = r^*$ will be called a regular optimal
growth. Such a growth is sustainable if:

$$d(t) \ge 0, \quad \text{or} \quad \lambda(r^*) - (\rho + \alpha) \ge 0. \tag{25}$$

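² One obtains: $d(\partial L/\partial \dot u)/dt = 0$, $\partial L/\partial u = 2u\xi \exp[(\alpha - i)t]$. Hence the first necessary
condition is: $u\xi = 0$.

$\partial L/\partial \dot r = (\xi - 1) \exp[(\alpha - i)t]$, $d(\partial L/\partial \dot r)/dt = [(\alpha - i)(\xi - 1) + \dot\xi] \exp[(\alpha - i)t]$;
$\partial L/\partial r = [d(\lambda r)/dr - (\rho + \alpha)](1 - \xi) \exp[(\alpha - i)t]$.

Hence, the second necessary condition: $(\alpha - i)(\xi - 1) + \dot\xi - [d(\lambda r)/dr - (\rho + \alpha)](1 - \xi) = 0$.
The two regimes correspond to $u = 0$ (and $d(t) = 0$) or $\xi = 0$, which
implies $d(\lambda r)/dr - (\rho + \alpha) = i - \alpha$.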





It is possible to associate to each value of $r^*$ a value $\alpha^*$ of $\alpha$:

$$\alpha^* = \lambda(r^*) - \rho. \tag{26}$$

If $\alpha > \alpha^*$, the regular optimal growth is not sustainable;
if $\alpha < \alpha^*$, the regular optimal growth is sustainable.
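For a concrete schedule, the regular optimal level and its critical growth rate can be computed in closed form. The sketch below assumes a linear schedule λ(r) = a − b·r, for which the marginal condition (22) becomes a − 2·b·r = ρ + i, and it reads (26) as the critical rate λ(r*) − ρ implied by (25). Function names and parameter values are illustrative assumptions.

```python
def regular_optimum(i, rho=0.1, a=0.5, b=0.1):
    """Return (r_star, alpha_star) for the assumed schedule lambda(r) = a - b*r,
    or None when no positive r_star exists."""
    if a <= rho + i:
        # Marginal profitability at r = 0 does not exceed rho + i: investing is not worthwhile.
        return None
    r_star = (a - rho - i) / (2 * b)        # solves a - 2*b*r = rho + i, i.e. condition (22)
    alpha_star = (a - b * r_star) - rho     # critical growth rate lambda(r*) - rho
    return r_star, alpha_star

def is_sustainable(alpha, i, **schedule):
    """Condition (25): dividends along r = r_star are nonnegative iff alpha <= alpha_star."""
    result = regular_optimum(i, **schedule)
    return result is not None and alpha <= result[1]

print(regular_optimum(0.08))
print(is_sustainable(0.2, 0.08), is_sustainable(0.3, 0.08))
```

The boundary case α = α* is treated as sustainable here because dividends are then exactly zero; that choice is a convention of the sketch.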

The optimal growth pattern of a firm starting with an invested capital
of $K_0$ will depend on the respective positions of $K_0$, $\bar r$, and $r^*$, or, which is
the same, on the position of $\alpha$ with respect to $k(0)$, $\lambda(K_0) - \rho$, and $\alpha^*$.

To study a situation of practical interest, it will be assumed that the
initial capital $K_0$ is not high enough to exhaust the profitable investment
opportunities at interest rate $i$. In other words:

$$K_0 < r^* \quad (\text{or } \lambda(K_0) - \rho > \alpha^*). \tag{27}$$

So, three possibilities remain:

$\bar r < K_0 < r^*$ (which includes the case $\bar r = 0$ or $\alpha > k(0)$),

$K_0 < \bar r < r^*$,

$K_0 < r^* < \bar r$.

If $\bar r < K_0 < r^*$, the firm invests at the maximum possible speed,
following a path of maximum growth, but $r$ decreases and the firm never
approaches the border of zero profitability. The firm always has an interest
in postponing the distribution of dividends to infinity, since the rate of
return $j$ of an additional investment remains constantly above $i$ (curve (A)
of Figs. 7, 8).



Figure 7 








Figure 8 

If $K_0 < \bar r < r^*$, the firm invests at the maximum possible speed,
following a path of maximum growth. The firm limits the discrepancy
between its “share” of the sector and the optimal “share,” but it cannot
pass the limiting ratio $\bar r/r^*$. For the same reasons, the distribution of
dividends is postponed to infinity (curve (B) of Figs. 7, 8).

If $K_0 < r^* < \bar r$, the firm invests at the maximum possible speed,
but at a certain time, it reaches the border of zero profitability of an
additional investment. At that time:

$$r(t) = r^*, \qquad j(t) = i. \tag{28}$$

From then on, the net investment of the firm grows at rate $\alpha$ and the firm
distributes dividends growing also at rate $\alpha$. The firm follows the regular
optimal growth pattern and distributes all the remaining profits (curve (C)
of Figs. 7, 8).

The limiting cases $\bar r = r^*$, $\bar r = K_0$, and $K_0 = r^*$ are obvious.

In terms of $\alpha$, the study assumes:

$$\alpha^* < \lambda(K_0) - \rho$$

and curves (A), (B), and (C) correspond, respectively, to

$\alpha > k(0)$ or $\lambda(K_0) - \rho < \alpha < k(0)$,

$\alpha^* < \alpha < \lambda(K_0) - \rho$,

$\alpha < \alpha^*$,

so that the various types of curves may be interpreted in terms of rates
of growth of the sector. Only if this rate is small enough can the firm reach
and sustain the regular optimal growth pattern.
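The three-way classification in terms of α can be written as a small decision rule. The sketch below again assumes a linear schedule λ(r) = a − b·r and the standing assumption that the initial capital lies below r*; the function name, the labels returned, and the numerical values are illustrative assumptions.

```python
def growth_regime(alpha, k0, i, rho=0.1, a=0.5, b=0.1):
    """Classify a firm into curve (A), (B), or (C), assuming lambda(r) = a - b*r
    and an initial capital k0 below r_star."""
    r_star = (a - rho - i) / (2 * b)     # regular optimal level from condition (22)
    alpha_star = (a - b * r_star) - rho  # critical growth rate lambda(r*) - rho
    k_initial = (a - b * k0) - rho       # intrinsic rate of return of the initial capital
    if not (0 <= k0 < r_star):
        raise ValueError("the classification assumes 0 <= K0 < r*")
    if alpha > k_initial:
        return "A"   # sector grows too fast: the firm's share declines
    if alpha > alpha_star:
        return "B"   # share improves but the optimal share is never reached
    return "C"       # optimal share is reached and then maintained

for alpha in (0.35, 0.27, 0.10):
    print(alpha, growth_regime(alpha, k0=1.0, i=0.08))
```

Boundary values (the limiting cases) are assigned to the lower regime purely as a convention of the sketch.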





The results are fairly realistic. In practical terms:

(i) When $\alpha > \lambda(K_0) - \rho$, the rate of growth of the sector is so high
that the firm cannot maintain its share in spite of the investment of all
its profits. This is observed in quickly expanding young industries where the
existing firms cannot grow fast enough to prevent the entry of new
competitors.

(ii) When $\alpha^* < \alpha < \lambda(K_0) - \rho$, the firm improves its share, but
the rate of growth of the sector is still too high for the firm to reach the
“optimal” share. Of course, in that case, some other competitors, which
are in a less favorable position (with a different $\lambda(r)$), see their shares
decrease.

(iii) When $\alpha < \alpha^*$, the firm can exploit fully its investment
opportunities and reach the share which is optimal in view of the strength and
the possibilities of the competitors. When this optimal share is reached, the
growth does not alter the relative position of the firm within the sector.
The firm only maintains that position.

6. In this model, the firm has no choice concerning the nature
of the investment made at time $t$. To introduce this possibility of choice,
it would be necessary to introduce a variable $x$ which would measure the
capitalistic intensity of the projects chosen at time $t$. $x$ would be a control
variable, function of $t$.

The analysis will not be presented here. It shows that, during the period
of maximum growth, the firm chooses the type with the maximum intrinsic
rate of return, because it generates cash to sustain growth more quickly.
When the firm approaches a state of regular optimal growth, this type tends
to be the one which also maximizes the present worth of the marginal
investment, while the rate of return tends to the interest rate $i$.


The Borrowing Situation 

7. The firm will now have the possibility of borrowing at any time at
interest rate $i'$ an amount $X(t)$ not greater than a fraction $g$ ($g \le 1$) of
its net investment:

$$0 \le X(t) \le g\, I(t). \tag{29}$$

Condition (29) represents pretty well the behavior of the banking system,
which will not lend to a firm more than a fraction of its own funds (which
is equivalent to a fraction of its net investment if we neglect short-term
assets and liabilities).





Relation (8) becomes, in an obvious manner:

$$d(t) + c(t) = \mu(t) + \dot X(t) - i' X(t), \tag{30}$$

since one has to pay the interest $i' X(t)$ out of the cash-flow, but one has
the possibility of spending the additional borrowed amount $\dot X(t)$.

We shall only study the policy of maximization of the present worth.
Replacing the variable $X(t)$ by the variable $x(t)$,

$$X(t) = x(t) \exp(\alpha t),$$

the mathematical expression of the problem is:³ find the functions $r(t)$
and $x(t)$ such that:

$$\int_0^\infty [\lambda(r)\, r - (\rho + \alpha)\, r - \dot r + \dot x + (\alpha - i')\, x] \exp[(\alpha - i)t]\, dt \to \max \tag{31}$$

under the constraints:

$$\lambda(r)\, r - (\rho + \alpha)\, r - \dot r + \dot x + (\alpha - i')\, x \ge 0 \tag{32}$$

(constraint on the sign of the dividends $d(t)$),

$$x \ge 0, \tag{33}$$

$$gr - x \ge 0 \tag{34}$$

(constraint (29) rewritten with $x$ and $r$ instead of $X$ and $I$),
with the initial conditions:

$$K_0 \le r(0) \le K_0/(1 - g), \tag{35}$$

since the initial investment is at least equal to the capital and cannot be
greater than the sum of the capital and the maximum possible borrowing.

We have now to maximize the functional (31) under the three constraints
(32), (33), and (34). First we transform the inequalities into equalities by
the introduction of the slack variables $Z^2$, $U^2$, and $W^2$. Then we associate
to the constraints the Lagrange multipliers:

$$\eta \exp[(\alpha - i)t], \quad \xi \exp[(\alpha - i)t], \quad \nu \exp[(\alpha - i)t],$$

and consider the Lagrangian:

$$H(t) = \{\lambda(r)\, r - (\rho + \alpha)\, r - \dot r + \dot x + (\alpha - i')\, x + \eta[\lambda(r)\, r - (\rho + \alpha)\, r - \dot r + \dot x + (\alpha - i')\, x - Z^2] + \xi[x - U^2] + \nu[gr - x - W^2]\} \exp[(\alpha - i)t]. \tag{36}$$

³ Obviously: $\dot X - i'X = [\dot x + (\alpha - i')x] \exp(\alpha t)$. Hence: $d(t) = [\lambda(r)r - (\rho + \alpha)r - \dot r + \dot x + (\alpha - i')x] \exp(\alpha t)$.





By differentiating this function, it is shown that the extremal evolutions
verify the conditions⁴

$$Z\eta = U\xi = \nu W = 0, \tag{37}$$

$$\dot\eta + \left[\lambda(r) + r \frac{d\lambda}{dr} - (\rho + i)\right](1 + \eta) + \nu g = 0, \tag{38}$$

$$\dot\eta + (i' - i)(1 + \eta) - \xi + \nu = 0, \tag{39}$$

so that there are eight possible evolutions:

$Z = U = W = 0$, $\quad \eta = U = W = 0$,
$Z = U = \nu = 0$, $\quad \eta = U = \nu = 0$,
$Z = \xi = W = 0$, $\quad \eta = \xi = W = 0$,
$Z = \xi = \nu = 0$, $\quad \eta = \xi = \nu = 0$.

Out of them, five are possible:⁵

1. The self-financing growth ($Z = U = \nu = 0$). In this evolution,
the firm does not borrow and follows the path of maximum growth
already described. It means that it is not profitable to invest if one has to
pay an interest rate $i'$. Naturally, along that regime

$$\dot r = \lambda(r)\, r - (\rho + \alpha)\, r. \tag{40}$$

2. The regular self-financing optimal growth ($\eta = U = \nu = 0$). It
is the second type of evolution in self-financing growth. The firm does not
borrow and maintains $r$ at the value $r^*$ such that

$$\lambda(r^*) + r^* \frac{d\lambda}{dr} = \rho + i, \tag{41}$$

which means that marginally the rate of return is equal to the interest rate
of the shareholders $i$.

⁴ $d(\partial L/\partial \dot Z)/dt = d(\partial L/\partial \dot U)/dt = d(\partial L/\partial \dot W)/dt = 0$; $\partial L/\partial Z = Z\eta$; $\partial L/\partial U = U\xi$;
$\partial L/\partial W = W\nu$. Hence the three conditions (37): $Z\eta = U\xi = \nu W = 0$.
$d(\partial L/\partial \dot r)/dt = -d\{(1 + \eta) \exp[(\alpha - i)t]\}/dt = -[(\alpha - i)(1 + \eta) + \dot\eta] \exp[(\alpha - i)t]$,
$\partial L/\partial r = \{(1 + \eta)[\lambda + r\, d\lambda/dr - (\rho + \alpha)] + \nu g\} \exp[(\alpha - i)t]$.
Hence the condition (38): $\dot\eta + (1 + \eta)[\lambda + r\, d\lambda/dr - (\rho + i)] + \nu g = 0$. Finally:

$d(\partial L/\partial \dot x)/dt = d\{(1 + \eta) \exp[(\alpha - i)t]\}/dt = [(\alpha - i)(1 + \eta) + \dot\eta] \exp[(\alpha - i)t]$,

$\partial L/\partial x = \{(\alpha - i')(1 + \eta) + \xi - \nu\} \exp[(\alpha - i)t]$.

Hence the condition (39): $\dot\eta + (i' - i)(1 + \eta) - \xi + \nu = 0$.

⁵ The three impossible evolutions are: $Z = U = W = 0$ and $\eta = U = W = 0$,
since the conditions $U = W = 0$ imply that the debt is simultaneously equal to its
minimum and to its maximum; and $\eta = \xi = \nu = 0$, since in that case condition (39) reduces
to $i' = i$, which is in general untrue.





3. The maximum borrowing growth ($Z = \xi = W = 0$). In this
evolution, the borrowing is kept at the maximum level: $x = gr$, and

$$\dot r = \{\lambda(r)\, r - [\rho + \alpha(1 - g) + i'g]\, r\}/(1 - g). \tag{42}$$

It means that it is profitable to borrow for investment at rate $i'$ the
maximum amount available.

4. The regular borrowing optimal growth ($\eta = \xi = W = 0$). In this
evolution, the borrowing is kept at the maximum level: $x = gr$, and $r$
is fixed at a level $r^{**}$ such that:

$$\lambda(r^{**}) + r^{**} \frac{d\lambda}{dr} = \rho + i + (i' - i)g. \tag{43}$$

Relation (43) expresses that the rate of return is equal to the average
interest paid on the money employed: $i + (i' - i)g$.

5. The disengaging growth ($Z = \xi = \nu = 0$). In this evolution:

$$0 < x < gr. \tag{44}$$

$r$ is kept constant and equal to $r^{***}$:

$$\lambda(r^{***}) + r^{***} \frac{d\lambda}{dr} = \rho + i', \tag{45}$$

$x$ being defined by

$$\lambda(r^{***})\, r^{***} - (\rho + \alpha)\, r^{***} + \dot x + (\alpha - i')\, x = 0. \tag{46}$$

In the disengaging evolution, the rate of return is equal to the interest
rate on debt $i'$, since the profits are used to get rid of the debts. In this
regime, the borrowing declines, the firm passing from a borrowing to
a self-financing growth. The firm only distributes dividends in evolutions
2 and 4.
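The three stationary levels can be compared on a concrete schedule. The sketch below assumes a linear schedule λ(r) = a − b·r, so that λ + r·dλ/dr = a − 2·b·r, and takes the right-hand side of (45) to be ρ + i′, as the sentence on the disengaging evolution states. The function name and all numerical values are illustrative assumptions.

```python
def stationary_levels(i, i_debt, g, rho=0.1, a=0.5, b=0.1):
    """Solve a - 2*b*r = rhs for the right-hand sides of (41), (43), and (45),
    under the assumed schedule lambda(r) = a - b*r."""
    def solve(rhs):
        return (a - rhs) / (2 * b)
    r_self = solve(rho + i)                        # (41): regular self-financing growth
    r_borrow = solve(rho + i + (i_debt - i) * g)   # (43): regular borrowing growth
    r_disengage = solve(rho + i_debt)              # (45): disengaging growth
    return r_self, r_borrow, r_disengage

# Hypothetical rates: shareholders' rate 0.08, debt rate below and above it, g = 0.5.
print(stationary_levels(i=0.08, i_debt=0.05, g=0.5))
print(stationary_levels(i=0.08, i_debt=0.12, g=0.5))
```

Printing the levels for a debt rate below and above the shareholders' rate shows how the ordering of the three values reverses.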

8. It is interesting to study how the various evolutions are linked. 
A full study will not be presented here, but some comments will be of 
interest. 

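We shall remark that the three values of $r$ associated to a regular optimal
growth verify the following inequalities, since $\lambda + r\, d\lambda/dr$ is decreasing and
the right-hand side of (43) lies between those of (41) and (45):

If $i' < i$, $r^* < r^{**} < r^{***}$;

If $i' > i$, $r^{***} < r^{**} < r^*$.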





1. $i' < i$

In the evolutions 1 and 3, the firm size tends respectively to values $\bar r$
and $\hat r$ (each one being positive or equal to zero) such that:

$$\lambda(\bar r) - \rho = \alpha, \tag{47}$$

$$\lambda(\hat r) - \rho = \alpha(1 - g) + i'g, \tag{48}$$

so that:

$$\hat r \gtrless \bar r \quad \text{if} \quad \alpha \gtrless i', \tag{49}$$

which defines two subcases:

(a) $i' < \alpha$. It is easily seen that the rate of growth in evolution 3
is always more than $1/(1 - g)$ times higher, for a given $r$, than the rate of
growth of evolution 1. One can also show that, for $i > \alpha$, the dividend
in evolution 4:

$$d(t) = [\lambda^{**} - (\rho + \alpha) + (\alpha - i')g]\, r^{**} \exp(\alpha t)$$

is higher than in evolution 2:

$$d(t) = [\lambda^* - (\rho + \alpha)]\, r^* \exp(\alpha t).$$

So, when the rate of interest on debt is smaller than the interest rate of
the shareholders and than the rate of growth of investment opportunities,
the best policy for the firm is to start with a maximum borrowing growth
(evolution 3) and to pass to the regular borrowing optimal growth (evolution
4) if it is sustainable.
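The claim about relative growth rates in subcase (a) can be checked numerically by comparing the right-hand sides of (40) and (42). The sketch below assumes a linear schedule λ(r) = a − b·r; the function names and parameter values (α = 0.05, i′ = 0.03, g = 0.5) are illustrative assumptions.

```python
def r_dot_self(r, alpha, rho=0.1, a=0.5, b=0.1):
    """Evolution 1, Eq. (40): r' = lambda(r)*r - (rho + alpha)*r, with lambda(r) = a - b*r."""
    return (a - b * r) * r - (rho + alpha) * r

def r_dot_borrow(r, alpha, i_debt, g, rho=0.1, a=0.5, b=0.1):
    """Evolution 3, Eq. (42): growth with borrowing kept at its maximum x = g*r."""
    return ((a - b * r) * r - (rho + alpha * (1 - g) + i_debt * g) * r) / (1 - g)

alpha, i_debt, g = 0.05, 0.03, 0.5   # hypothetical values with i_debt < alpha
for r in (0.5, 1.0, 2.0):
    slow = r_dot_self(r, alpha)
    fast = r_dot_borrow(r, alpha, i_debt, g)
    print(r, slow, fast, fast / slow)
```

For these values every printed ratio is at least 1/(1 − g) = 2; the excess over that factor comes from the term (α − i′)·g·r, which is positive when the debt rate is below α.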

It may be proved also that:

$\bar r \ge r^*$ implies $\hat r \ge r^{***}$.

(Evolution 4 is sustainable if evolution 2 is sustainable.)

(b) $i' > \alpha$. Since $\bar r > \hat r$ (the limiting sizes of evolutions 1 and 3), three situations are possible:

$r^* < \bar r$ and $r^{**} < \hat r$. The two regular optimal growth patterns are
sustainable. If we assume the validity of a simple economic reasoning,
the best policy should remain the sequence evolution 3-evolution 4
(maximum borrowing growth and regular borrowing optimal growth).

$r^* < \bar r$ and $\hat r < r^{**}$. The regular borrowing growth is not
sustainable while the self-financing regular growth is sustainable. The
dividends of the borrowing growth may be higher, but they are postponed
to infinity, while the other ones are distributed after a finite time.






$r^* > \bar r$ and $r^{**} > \hat r$. None of the regular optimal growth patterns
is sustainable. In both cases, the interest of the firm is to postpone
dividends to infinity. The self-financing growth leads to a size closer to
the optimal one.


2. $i' > i$


If it is sustainable, the final evolution is always the regular self-financing
optimal growth (evolution 2). It is never interesting to borrow, except for
a transitory period, because the interest rate on debt $i'$ is above the interest
rate of the shareholders.

According to the values of parameters, the succession of evolutions
may be:

3 → 5 → 1 → 2
5 → 1 → 2
1 → 2

In the first case (assuming $\bar r > r^*$ and $\hat r > r^{***}$, where $\bar r$ and $\hat r$ are the limiting sizes of evolutions 1 and 3):

(i) The firm starts with the maximum amount of borrowing and
follows a path of maximum borrowing growth (evolution 3).

(ii) When the firm reaches $r^{***}$, it is no longer profitable to borrow
(since $i' > i$ and because the marginal productivity of the investment
has decreased). The firm maintains $r$ constant and disengages from
borrowing (evolution 5).

(iii) Then, it follows a path of self-financing growth, till it reaches
the regular self-financing optimal path. Such a growth pattern is pictured
in Fig. 9. In that case, borrowing is a transitory policy used to accelerate
the speed at which the stable evolution is reached.



Figure 9 






9. Even if the discussion of the borrowing case has not been completely 
developed, this model shows two important aspects of the growth policy 
of the firm: 

(i) The interaction between the appearance of new investment
possibilities due to the development of the economy (and consequently
of the sector) and the growth patterns open to the firm.

(ii) The succession of various evolutions, some corresponding to
maximum growth patterns during which the firm builds its “share” of
the economy and some corresponding to stable shares during which the
firm distributes dividends. Depending on the interest rate on borrowing,
the possibility of borrowing may never be used, be used for transitory periods only, or
be used over the whole life of the firm.

The results obtained are in agreement with the observations of real life. 

10. Let us sketch how technical progress could be introduced into
the model. If the technical progress is exogenous, we may replace Eq. (2)
by:

$$\mu(t) = \lambda(r)\, I \exp(\sigma t), \tag{50}$$

where $\sigma$ is the differential rate of technical progress of the firm, by
comparison with its competitors. (If the absolute rate of technical progress is
the same for all the firms of the sector, the cash-flow will probably not
change since the price will fall, but a differential rate of technical progress
will give to the firm the possibility to increase its cash-flow.)

If the technical progress is endogenous, we may assume that the
contribution to investment of an expenditure $c(\tau)$ is an increasing function of $\tau$
and replace relation (1) by:

$$I(t) = \int_0^t \exp[-\rho(t - \tau)] \exp(\sigma\tau)\, c(\tau)\, d\tau + \exp(-\rho t)\, I(0). \tag{51}$$

These brief comments show how important it could be to study more 
deeply the interactions between the growth of the economy, the growth 
of a given industrial sector within that economy and the growth of a given 
firm within that sector. 


References 

1. A. Bensoussan, E. G. Hurst, and B. Naslund, “Management Applications of
Modern Control Theory,” North-Holland, Amsterdam, 1975.

2. A. E. Bryson and Y. C. Ho, “Applied Optimal Control,” Ginn, Waltham,
Mass., 1969.

3. M. J. Gordon, “The Investment, Financing and Valuation of the Corporation,”
R. D. Irwin, Illinois, 1962.

4. G. Hadley and M. C. Kemp, “Variational Methods in Economics,” North-Holland,
Amsterdam, 1971.

5. M. D. Intriligator, “Mathematical Optimization and Economic Theory,”
Prentice-Hall, Englewood Cliffs, N. J., 1971.

6. A. P. Jacquemin, “The Dynamic Analysis of Advertising Policy,” European
Institute for Advanced Studies in Management, Bruxelles, 1972.

7. H. W. Kuhn and G. P. Szego, “Mathematical System Theory and Economics,”
Springer, Berlin, 1969.

8. G. Leitmann, “Optimization Techniques with Application to Aerospace Systems,”
Academic Press, New York, 1962.

9. J. Lesourne, “Modèles de croissance des entreprises,” Dunod, Paris, 1972.

10. F. Lutz and V. Lutz, “The Theory of Investment of the Firm,” Princeton Univ.
Press, Princeton, 1951.

11. R. Marris, “The Economic Theory of Managerial Capitalism,” Macmillan,
London, 1964.

12. P. Massé, “Le choix des investissements: critères et méthodes,” Dunod, Paris, 1959.

13. E. Penrose, “The Theory of the Growth of the Firm,” Wiley, New York, 1959.

14. A. A. Robichek and S. C. Myers, “Optimal Financing Decisions,” Prentice-Hall,
Englewood Cliffs, N. J., 1965.

15. E. Solomon, “The Theory of Financial Management,” Columbia Univ. Press,
New York, 1963.

16. E. Solomon, “The Management of Corporate Capital,” The Free Press, New York,
1959.

17. H. M. Weingartner, “Mathematical Programming and the Analysis of Capital
Budgeting Problems,” Prentice-Hall, New York, 1963.



JOURNAL OF ECONOMIC THEORY 13, 138-153 (1976) 


Rights Exercising and a Pareto-Consistent Libertarian Claim

Jerry S. Kelly

Syracuse University, Syracuse, New York 13210
Received May 27, 1975; revised February 25, 1976

1. Introduction 

Sen’s proof [1] of the impossibility of a Paretian liberal has stimulated
considerable work in collective choice theory [2-12]. Of this work, the 
part that is most likely to lead to a long run improvement in the way we 
deal with impossibility theorems is Gibbard's introduction of the notions 
of rights systems and rights waiving [7]. Unfortunately, Gibbard’s analysis 
is flawed by some assumptions about individual behavior which can come 
into conflict with the information and incentive structure of his model. 
It is the purpose of this paper to point out those flaws, to correct some 
of them, to show the depth of one of them and to illustrate the impact 
of these ideas on impossibility results. 


2. Framework, Sen’s Theorem 

Our interest is in two sets: the set, $E$, of alternatives and the set, $H$,
of $\nu$ individuals, with $2 \leq \nu < \infty$. Let $R$ be the set of all reflexive, complete
and transitive relations on $E$. A society, $S$, is an element of the $\nu$-fold
Cartesian product, $R^{\nu}$. If $S = (R_{1S}, \ldots, R_{\nu S})$, we say that $R_{iS}$ is individual
$i$’s preference relation in $S$. Strict preference and indifference are defined
in the usual way:

$xP_{iS}y$ iff $xR_{iS}y$ and not $yR_{iS}x$;

$xI_{iS}y$ iff $xR_{iS}y$ and $yR_{iS}x$.

An agenda is a subset, $A$, of $E$. A social choice function, $C$, assigns to an
agenda and a society a set, $C(A, S)$, of “chosen” alternatives in accordance
with the rule

$\emptyset \neq C(A, S) \subseteq A$.
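The definitions of this section can be made concrete in a few lines of Python. What follows is only a sketch under assumptions of our own: a weak ordering is encoded as a list of indifference classes, best class first, and the names `make_weak_order`, `strictly_prefers`, `indifferent`, and `is_valid_choice` are hypothetical helpers, not notation from the paper.

```python
from itertools import product

def make_weak_order(tiers):
    """Build a reflexive, complete, transitive relation R on the alternatives.

    `tiers` is a list of sets of alternatives, best indifference class first
    (an encoding assumed for this sketch). R is returned as a set of ordered
    pairs (a, b), read "a R b".
    """
    level = {alt: k for k, tier in enumerate(tiers) for alt in tier}
    return {(a, b) for a, b in product(level, repeat=2) if level[a] <= level[b]}

def strictly_prefers(R, x, y):
    # x P y  iff  x R y and not y R x
    return (x, y) in R and (y, x) not in R

def indifferent(R, x, y):
    # x I y  iff  x R y and y R x
    return (x, y) in R and (y, x) in R

def is_valid_choice(chosen, agenda):
    # The rule imposed on C: the choice set is nonempty and lies in the agenda.
    return len(chosen) > 0 and set(chosen) <= set(agenda)

# One individual who ranks x first and is indifferent between y and z.
R = make_weak_order([{"x"}, {"y", "z"}])
print(strictly_prefers(R, "x", "y"), indifferent(R, "y", "z"))
```

A society in the sense of the text would then be a tuple of such relations, one per individual.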


Copyright © 1976 by Academic Press, Inc. 

All rights of reproduction in any form reserved. 





Most of the conditions drawn from moral philosophy that economists
have imposed on $C$ are concerned with decisiveness. A set, $T$, of individuals
is said to be decisive for alternative $x$ against alternative $y$ if, whenever
$xP_{iS}y$ for all $i \in T$, then

$x \in A \rightarrow y \notin C(A, S)$.

The two decisiveness conditions we will examine are

The weak Pareto condition: For all $x$ and $y$ in $E$, the set, $H$, of all
individuals is decisive for $x$ against $y$; and

Minimal liberalism: There is an individual $i \in H$ and two alternatives,
$x, y \in E$, such that $\{i\}$ is decisive for $x$ against $y$ and for $y$ against $x$; also
there is another individual, $j \in H$, $j \neq i$, and two alternatives, $w, z \in E$,
such that $\{j\}$ is decisive for $w$ against $z$ and for $z$ against $w$.

With these preliminaries, we can state Sen’s result: 

Theorem 1. (The impossibility of a Paretian liberal.) There is no social
choice function, $C$, satisfying both the weak Pareto condition and minimal
liberalism if the domain of $C$ contains all the elements of the Cartesian
product $\{\{x, y, z, w\}\} \times R^{\nu}$.
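One way to see why no such choice function exists is to exhibit a society at which the two conditions together exclude every alternative. The Python sketch below does this for a two-person society of our own choosing (individual i ranks w, x, y, z in that order; individual j ranks y, z, w, x); the profile, the function names, and the encoding of orderings as best-first lists are assumptions made for illustration and are not taken from Sen's paper.

```python
def prefers(ranking, a, b):
    """True if a is strictly preferred to b; `ranking` lists alternatives best first."""
    return ranking.index(a) < ranking.index(b)

def surviving(agenda, society, decisive):
    """Alternatives not excluded by weak Pareto or by a decisive individual.

    `decisive` holds triples (i, x, y): {i} is decisive for x against y.
    """
    excluded = set()
    for x in agenda:
        for y in agenda:
            if x == y:
                continue
            # Weak Pareto: everyone strictly prefers x to y, so y is not chosen.
            if all(prefers(r, x, y) for r in society.values()):
                excluded.add(y)
            # Minimal liberalism: i is decisive for x against y and prefers x.
            for (i, a, b) in decisive:
                if (a, b) == (x, y) and prefers(society[i], x, y):
                    excluded.add(y)
    return set(agenda) - excluded

agenda = ["w", "x", "y", "z"]
society = {"i": ["w", "x", "y", "z"], "j": ["y", "z", "w", "x"]}
decisive = [("i", "x", "y"), ("i", "y", "x"), ("j", "w", "z"), ("j", "z", "w")]

print(surviving(agenda, society, decisive))  # prints set()
```

Here y falls to i's decisiveness, w to j's, and x and z to the weak Pareto condition, so nothing in the agenda can be chosen.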

What sets this apart from nearly all of the other three or four dozen 
impossibility theorems is its freedom from reliance on the two conditions, 
independence of irrelevant alternatives and rationality, which the moral 
philosopher sees as “merely” conditions of computational convenience 
and simplicity.¹ Sen appears to have found inconsistency between two
basic (and presumably widely held) ethical constraints on social choice. 


3. Gibbard’s Revision 

Gibbard’s resolution of the problem posed by Sen’s theorem is to 
declare the decisive set formulation of liberalism to be an inappropriate 
representation of that philosophical position. For Gibbard, the decisiveness
condition for individual b should be decomposed into two parts:


¹ From the perspective of the welfare economist, conditions of computational
simplicity are not so easily dismissed. Computational complexities are costly, i.e.,
resource-using, and must be worried over. We may not want to buy “justice” at the
cost of starvation. The impossibility theorems using these conditions clarify the kinds
of injustices implied by certain simplifications. What a welfare economist must find
disappointing about those theorems is the lack of any justification that rationality and
IIA are good representations of the resource costs of complexity, that they are the
rules-of-thumb he wants to impose.





(i) b has a right to x over y;

(ii) b will exercise that right whenever x is available and $xP_{bS}y$.

Accepting (i), he challenges (ii), seeing circumstances in which, if b
exercises the right to x over y, there may result a social choice he likes
less than what results when the right is not exercised. Given certain
assumptions about what information is available to b, it is unreasonable
to expect b to follow the rights exercising rule expressed in (ii).

Gibbard’s revision of (ii) is an indirect one, saying that b exercises a
right unless he waives it. This obviously requires a criterion for
rights-waiving. Informally, b will waive his right to x over y if it gets him into
trouble. “Trouble” in this case means ending up with an alternative at
least as bad as y. So we let $\Sigma$ be a finite sequence, $y_1, y_2, \ldots, y_\lambda = x$,
of alternatives. If b exercises his right to x over y and then is “forced”
to take $y_{\lambda-1}$ over $x$, $y_{\lambda-2}$ over $y_{\lambda-1}$, ..., $y_1$ over $y_2$, and if $yR_{bS}y_1$, he has
gotten into trouble. How might b be forced to take, say, $y_{\lambda-1}$ over $x$? In
either of two ways:

(A) by the weak Pareto condition, if everyone strictly prefers
$y_{\lambda-1}$ to $x$;

(B) by someone else exercising a right to $y_{\lambda-1}$ over $x$.

Now with (B), we seem to have gotten into a circularity, defining exercising
in terms of waiving, which now is defined in turn in terms of exercising.
Gibbard avoids this circularity by the trick of assuming that the
“exercising” in (B) proceeds in accordance with the classical rule, (ii).
All that is now needed to express the revised rule is the adoption of the
following convention: a rights-system is a set $\mathcal{R} \subseteq E \times E \times H$; if
$\langle x, y, b \rangle \in \mathcal{R}$, we say $\mathcal{R}$ attributes to person b the “right” to alternative
x over alternative y. Thus we have:

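(ii-1) If $\langle x, y, b \rangle \in \mathcal{R}$, b will exercise that right if $xP_{bS}y$ and if b
does not waive the right. b waives $\langle x, y, b \rangle$ if there is a finite sequence,
$\Sigma = y_1, y_2, \ldots, y_\lambda$ such that:

(a) $y_\lambda = x$;

(b) $yR_{bS}y_1$;

(c) for every $i = 1, 2, \ldots, \lambda - 1$, either

$(\forall e)\; y_i P_{eS} y_{i+1}$, or

$(\exists e)[e \neq b \;\&\; \langle y_i, y_{i+1}, e \rangle \in \mathcal{R} \;\&\; y_i P_{eS} y_{i+1}]$.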

The remainder of this paper is a series of observations on and criticisms 
of this rule together with some attempted improvements. 





4. First Observations 

An easy early remark on Rule (ii-1) is that it places very heavy demands
on the information structure. Not only must each individual know all
of his rights as well as his own preference ordering, he must know the
preference orderings of all other individuals and must know all rights
assignments. He must, in addition, be able to carry out evaluations of $\Sigma$
sequences of arbitrary length. Now one could imagine a central agency
to which each individual submits his preference ordering and his
rights-exercising strategy, and while this reduces strain on the information
transferal system, it increases individuals’ calculation problems on rights
exercising (for the strategy will have to describe exercising rules contingent
on each of the possible sets of others’ preferences). And, of course, there
is the obvious problem of providing the agency with incentives to honestly
amalgamate preferences and rights-exercising strategies and individuals
with incentives to submit true preferences [on the latter, see 13-18]. These
are important problems that deserve more attention than they will receive
in this paper, which deals with choosing exercising rules in the presence
of a free and perfect information structure.

A second easy observation is that this is an extremely cautious,
risk-averse criterion for rights-exercising. There may be many $\Sigma$ sequences
starting from x that end up in a $y_1$ which is vastly preferred to y. But if there
is just one ending in a no-better-than-y alternative, the right is to be waived.
The moves along the $\Sigma$ sequence are out of b’s control and b is taken to
worry about and act upon the worst eventualities. The appropriateness
of such caution depends upon the specific interpretation of “alternative”
and “individual” and on what rights are assigned. At the level of generality
of Gibbard’s paper, little can be decided about appropriateness. This
paper will continue the discussion with a flavor of cautious exercising.

Now we proceed to two minor technical issues concerning Rule (ii-1).
First, b need not worry about being forced to take $y_{\lambda-1}$ if $y_{\lambda-1}$ is not
available, i.e., if $y_{\lambda-1}$ is not in the agenda, A. Gibbard is not very careful
about this issue of availability. While we shall revise (ii-1) to take account
of A, it should be pointed out that this alone would have no impact on
Gibbard’s results.

The second issue is that in forcing the move from y to x by exercising
$\langle x, y, b \rangle$, b doesn’t seem to have gotten into trouble if he is forced in the
end to take a $y_1$ where he is indifferent between $y_1$ and y. Waiving might be
appropriate for a cautious exerciser if $yP_{bS}y_1$ for some $\Sigma$, but not if only
$yR_{bS}y_1$ as in (b) of (ii-1). There is evidence in one of Gibbard’s examples
of his recognition of this problem. The example involves an agenda of
three alternatives:





$w_E$: Edwin weds Angelina;

$w_J$: the judge weds Angelina and Edwin remains single;

$w_0$: both Edwin and Angelina remain single.

The rights system includes both of

$\langle w_J, w_0, A \rangle$ (“Angelina has a right to marry the willing judge.”)

$\langle w_0, w_E, E \rangle$ (“Edwin has the right to remain single”).

Angelina prefers $w_E$ to $w_J$ to $w_0$ while Edwin prefers $w_0$ to $w_E$ to $w_J$.
Using the classical rule, $w_0$ is eliminated when Angelina exercises her
right to $w_J$ over $w_0$; $w_E$ is eliminated when Edwin exercises his right to
$w_0$ over $w_E$; $w_J$ is eliminated in favor of $w_E$ on the weak Pareto condition.
Gibbard suggests that this dilemma be solved by Edwin waiving his right
to $w_0$ over $w_E$. For there is a $\Sigma = w_J, w_0$ with $w_E P_{ES} w_J$ and
$\langle w_J, w_0, A \rangle \in \mathcal{R}$. By (ii-1), “It may be to Edwin’s advantage to waive
his right to $w_0$ over $w_E$ in favor of the Pareto principle.” But we can
similarly analyze Angelina. For there is a $\Sigma = w_0, w_E, w_J$ with $w_0 R_{AS} w_0$,
$w_E$ Pareto superior to $w_J$ and $\langle w_0, w_E, E \rangle \in \mathcal{R}$. Thus on rule (ii-1), we
would have Angelina waive her right to $w_J$ over $w_0$. This is clearly a terrible
result; each waives on the incorrect belief that the other will exercise, and
each has enough information to know that, if (ii-1) is followed by all,
the belief is incorrect. The analysis depends on each individual making
a correctable error. Gibbard skirts this by amending (ii-1) so as to require
$y_1 \neq y$. But why rule out only those $y_1$’s that are indifferent to y by virtue
of identity? The whole collective choice literature has carefully avoided
separating identity-indifference and just plain indifference. Such separating
can also be avoided here by the natural expedient of altering part (b)
of (ii-1) to require strict preference.

These two technical issues are incorporated in (ii-2): If $\langle x, y, b \rangle \in \mathcal{R}$,
b will exercise that right at (A, S) if $xP_{bS}y$ and if b does not waive that right
at (A, S). b waives $\langle x, y, b \rangle$ at (A, S) if there is a finite sequence,
$\Sigma = y_1, y_2, \ldots, y_\lambda$ of alternatives in A such that:

(a) $y_\lambda = x$;

(b) $yP_{bS}y_1$;

(c) for every $i = 1, 2, \ldots, \lambda - 1$, either

$(\forall e)\; y_i P_{eS} y_{i+1}$, or

$(\exists e)[e \neq b \;\&\; \langle y_i, y_{i+1}, e \rangle \in \mathcal{R} \;\&\; y_i P_{eS} y_{i+1}]$.

Again we should point out that these changes alone have no impact on 
Gibbard’s results. 
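The difference between test (b) of (ii-1) and test (b) of (ii-2) can be checked mechanically on the Angelina-Edwin example. The Python sketch below is an illustration under assumptions of our own: preferences are encoded as rank dictionaries (0 is best), H is taken to consist of Angelina and Edwin only, and the names `prefers`, `forced`, and `waives` are hypothetical. Rather than enumerating sequences, it follows forced moves outward from x, which reaches exactly the alternatives that can appear as the first term of some admissible sequence.

```python
from collections import deque

def prefers(rank, a, b):
    """Strict preference; `rank` maps alternatives to positions (0 = best)."""
    return rank[a] < rank[b]

def forced(v, u, b, prefs, rights):
    """Clause (c): can a move from u to v be forced without b's own rights?"""
    if all(prefers(rank, v, u) for rank in prefs.values()):
        return True  # weak Pareto condition
    # someone other than b holds, and would classically exercise, a right to v over u
    return any(e != b and (v, u, e) in rights and prefers(prefs[e], v, u)
               for e in prefs)

def waives(right, agenda, prefs, rights, strict):
    """Waiver test of (ii-2) if `strict` is True, of (ii-1) otherwise."""
    x, y, b = right
    reached, queue = {x}, deque([x])
    while queue:  # breadth-first search over forced moves, starting from x
        u = queue.popleft()
        for v in agenda:
            if v not in reached and forced(v, u, b, prefs, rights):
                reached.add(v)
                queue.append(v)
    rank = prefs[b]
    if strict:
        return any(rank[y] < rank[y1] for y1 in reached)   # y P y1
    return any(rank[y] <= rank[y1] for y1 in reached)      # y R y1

agenda = ["wE", "wJ", "w0"]
prefs = {"A": {"wE": 0, "wJ": 1, "w0": 2},
         "E": {"w0": 0, "wE": 1, "wJ": 2}}
rights = {("wJ", "w0", "A"), ("w0", "wE", "E")}
angelina, edwin = ("wJ", "w0", "A"), ("w0", "wE", "E")

for strict in (False, True):
    print(strict, waives(angelina, agenda, prefs, rights, strict),
          waives(edwin, agenda, prefs, rights, strict))
```

With the weak test both individuals waive, which is the terrible result described above; with strict preference in (b) only Edwin waives.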





5. Sequence Extensions 

Gibbard has shown us that the classical rights-exercising rule is
unacceptable because it is possible to construct examples in which it seems 
silly for b to follow that rule. We now begin our task of constructing 
examples in which it seems silly for b to follow rule (ii-2). 

Consider the agenda, $\{w, x, y, z\}$, over which individuals b and e have the
preferences

    b    e

    x    y
    w    w
    y    z
    z    x

(In this table, is-higher-than corresponds to “is strictly preferred to.”)
The rights system is $\mathcal{R} = \{\langle x, y, b \rangle, \langle z, x, e \rangle\}$ and we ask if b should
exercise or waive $\langle x, y, b \rangle$. Let $\Sigma$ be the sequence: z, x. If b considers
exercising his right to x over y and expects e (who prefers z to x) to exercise
$\langle z, x, e \rangle$, he will run the risk of getting z which he likes less than y. By
(ii-2), he should waive $\langle x, y, b \rangle$. But this quite simply ignores the fact
that $\Sigma = z, x$ is part of an extended sequence of alternatives from A,
namely: w, z, x. And, although e’s expected exercising of $\langle z, x, e \rangle$ yields
the undesirable z, the weak Pareto condition would have w eliminate z
and b prefers w to y. In effect the sequence $\Sigma' = w, z$ repairs the trouble
engendered by $\Sigma = z, x$. This example suggests the following revision:

(ii-3). If $\langle x, y, b \rangle \in \mathcal{R}$, b will exercise that right at (A, S) if $xP_{bS}y$
and if b does not waive that right at (A, S). b waives $\langle x, y, b \rangle$ at (A, S)
if there is a finite sequence, $\Sigma = y_1, y_2, \ldots, y_\lambda$ of alternatives in A such that:

(a) $y_\lambda = x$;

(b) $yP_{bS}y_1$;

(c) for every $i = 1, 2, \ldots, \lambda - 1$, either

$(\forall e)\; y_i P_{eS} y_{i+1}$, or

$(\exists e)[e \neq b \;\&\; \langle y_i, y_{i+1}, e \rangle \in \mathcal{R} \;\&\; y_i P_{eS} y_{i+1}]$;

(d) for every finite sequence, $\Sigma' = z_1, z_2, \ldots, z_{\lambda'}$ of alternatives
in A such that

(i) $z_{\lambda'} = y_1$;

(ii) $z_1 P_{bS} y$;

(iii) for every $i = 1, 2, \ldots, \lambda' - 1$, either

$(\forall e)\; z_i P_{eS} z_{i+1}$, or

$(\exists e)[\langle z_i, z_{i+1}, e \rangle \in \mathcal{R} \;\&\; z_i P_{eS} z_{i+1}]$,

there is a sequence, $\Sigma'' = w_1, w_2, \ldots, w_{\lambda''}$ of alternatives in A such that

(iv) $w_{\lambda''} = z_1$;

(v) $yP_{bS}w_1$;

(vi) for every $i = 1, 2, \ldots, \lambda'' - 1$, either

$(\forall e)\; w_i P_{eS} w_{i+1}$, or

$(\exists e)[e \neq b \;\&\; \langle w_i, w_{i+1}, e \rangle \in \mathcal{R} \;\&\; w_i P_{eS} w_{i+1}]$.

In this rule, part (d) says that any extending sequence, $\Sigma'$, that repairs $\Sigma$
in the eyes of b can enter some other, out-of-control sequence, $\Sigma''$, resulting
in an alternative worse than y. Note that b can take part in forcing moves
in the repairing sequence; we don’t require that the exercise of rights
(by the classical rule) be done only by individuals other than b.
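The effect of the repair clause (d) can be traced on the four-alternative example of this section. The following Python sketch rests on assumptions of our own: rank dictionaries encode the strict orderings of b and e, the helper names are hypothetical, and each family of sequences is replaced by a reachability search over forced moves (out-of-control moves exclude b's own rights, repairing moves do not).

```python
def prefers(rank, a, b):
    return rank[a] < rank[b]

def forced(v, u, prefs, rights, excluded=None):
    """Move from u to v forced by weak Pareto, or by a right holder other than `excluded`."""
    if all(prefers(rank, v, u) for rank in prefs.values()):
        return True
    return any(e != excluded and (v, u, e) in rights and prefers(prefs[e], v, u)
               for e in prefs)

def reachable(start, agenda, prefs, rights, excluded=None):
    """All alternatives obtainable from `start` by chains of forced moves."""
    reached, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in agenda:
            if v not in reached and forced(v, u, prefs, rights, excluded):
                reached.add(v)
                stack.append(v)
    return reached

def waives_ii2(right, agenda, prefs, rights):
    x, y, b = right
    return any(prefers(prefs[b], y, y1)
               for y1 in reachable(x, agenda, prefs, rights, excluded=b))

def waives_ii3(right, agenda, prefs, rights):
    x, y, b = right
    rank = prefs[b]

    def can_go_bad(start):
        # some out-of-control sequence from `start` ends strictly worse than y
        return any(prefers(rank, y, a)
                   for a in reachable(start, agenda, prefs, rights, excluded=b))

    for y1 in reachable(x, agenda, prefs, rights, excluded=b):
        if not prefers(rank, y, y1):
            continue  # clause (b) fails for this endpoint
        # clause (d): every repairing endpoint must itself lead back into trouble
        repairs = [z1 for z1 in reachable(y1, agenda, prefs, rights)
                   if prefers(rank, z1, y)]
        if all(can_go_bad(z1) for z1 in repairs):
            return True
    return False

agenda = ["w", "x", "y", "z"]
prefs = {"b": {"x": 0, "w": 1, "y": 2, "z": 3},
         "e": {"y": 0, "w": 1, "z": 2, "x": 3}}
rights = {("x", "y", "b"), ("z", "x", "e")}

print(waives_ii2(("x", "y", "b"), agenda, prefs, rights))  # True
print(waives_ii3(("x", "y", "b"), agenda, prefs, rights))  # False
```

Under (ii-2) the sequence z, x leads b to waive; under (ii-3) the repair through w, from which no further forced move is possible, removes the reason for waiving.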

Two remarks are worth making on this latest revision. First, it causes
no significant changes in the theorems that make up Gibbard’s libertarian
claim. Second, the revision could be improved if we assume individuals
also have information about the choice function. For $\Sigma$ sequences leading
to a $y_1$ worse than y can be “repaired” directly if the choice function always
excludes $y_1$ from C(A, S). The weak Pareto condition and rights exercising
rules are only sufficient, not necessary, conditions for excluding
alternatives. Such a change would not affect what follows in this paper.


6. Refinements of the Sequence Concept 

In examining a sequence like $y_1, y_2, \ldots, y_\lambda$, we have supposed that a
move from $y_{i+1}$ to $y_i$ can be forced only by either the weak Pareto
condition or by a single individual exercising a single rights triple,
$\langle y_i, y_{i+1}, e \rangle$. It is useful to expand our concern to sets of individuals
and sets of rights triples.

The first expansion can be illustrated by the Angelina-Edwin-Judge 
example discussed here in Section 4. Gibbard says [7, p. 398], “Angelina 
has a right to marry the willing judge instead of remaining single.” The 
introduction of a concept of “willingness” seems quite unnecessary. What 
we want to say in this case is that the set of individuals, {Angelina, judge}, 
has a right to being married to each other rather than each remaining 





single, that this right can be waived by either alone, but that the judge 
prefers marriage to Angelina over bachelordom and that Angelina 
correctly believes the judge will not waive their joint right. This exercise 
of the right by {Angelina, judge} in this case depends only on Angelina’s 
waiving decision. 

This example is an archetype for all contractual rights. We will henceforth
change the notion of a rights system to a subset of

$E \times E \times (2^H - \{\emptyset\})$

where $2^H$ is the power set of H, the set of all subsets of H. A triple, $\langle x, y, T \rangle$,
will be exercised (so that $x \in A \rightarrow y \notin C(A, S)$) only if every member of T
exercises that right. When we are working with a subset consisting of a
single individual, b, we will, without cause for confusion, continue to
write $\langle x, y, b \rangle$ rather than $\langle x, y, \{b\} \rangle$.

The reason for worrying about sets of rights triples at a single forcing
step in a sequence is illustrated by Gibbard’s other example. Suppose a
two-issue space, characterized by the white or yellow color of person 1’s
room and the white or yellow color of person 2’s room. Each has a “right
over own room color,” i.e., $\mathcal{R}$ contains

$\langle (w, w), (y, w), 1 \rangle$        $\langle (w, w), (w, y), 2 \rangle$
$\langle (w, y), (y, y), 1 \rangle$        $\langle (w, y), (w, w), 2 \rangle$
$\langle (y, w), (w, w), 1 \rangle$        $\langle (y, w), (y, y), 2 \rangle$
$\langle (y, y), (w, y), 1 \rangle$        $\langle (y, y), (y, w), 2 \rangle$.

We are currently in state (y, w), person 1 decides to exercise
$\langle (w, w), (y, w), 1 \rangle$ and person 2 decides to exercise $\langle (y, y), (y, w), 2 \rangle$.
What happens? The natural answer is (w, y). It seems we should worry
about a step from $y_{i+1}$ to $y_i$ being forced as a consequence of the
simultaneous joint exercising of several rights over $y_{i+1}$ by several (sets of)
individuals. We need a kind of production function where the “inputs”
are a current state and a set of rights triples with second component equal
to the current state and where the “output” is a new state. We will assume
that this function is fixed, independent of A and S, and will represent the
function by expressions like

“$y_i$ is the consequence of exercising $\langle y_{i1}, y_{i+1}, S_{i1} \rangle, \ldots,
\langle y_{ik(i)}, y_{i+1}, S_{ik(i)} \rangle$ simultaneously.”

With those preliminaries, the final revision of Gibbard’s rule we shall 
present is: 





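(ii-4). If $\langle x, y, T \rangle \in \mathcal{R}$ with $b \in T$, b will exercise that right at (A, S)
if $xP_{bS}y$ and if b does not waive that right at (A, S). b waives $\langle x, y, T \rangle$ at
(A, S) if there is a finite sequence, $\Sigma = y_1, y_2, \ldots, y_\lambda$ of alternatives in A
such that:

(a) $(\exists S_1, S_2, \ldots, S_k, x_1, x_2, \ldots, x_k)(\langle x_1, y, S_1 \rangle \in \mathcal{R} \;\&\; \langle x_2, y, S_2 \rangle \in \mathcal{R} \;\&\; \cdots \;\&\; \langle x_k, y, S_k \rangle \in \mathcal{R} \;\&\; [r \in S_i \rightarrow x_i P_{rS} y$ for all $i]$ & $y_\lambda$ is the consequence of
exercising $\langle x, y, T \rangle, \langle x_1, y, S_1 \rangle, \ldots,$ and $\langle x_k, y, S_k \rangle$ simultaneously);

(b) $yP_{bS}y_1$;

(c) for every $i = 1, 2, \ldots, \lambda - 1$, either

$(\forall e)\; y_i P_{eS} y_{i+1}$, or

$(\exists S_{i1}, S_{i2}, \ldots, S_{ik(i)}, y_{i1}, y_{i2}, \ldots, y_{ik(i)})\big(b \notin \bigcup_{j=1}^{k(i)} S_{ij} \;\&\; \langle y_{i1}, y_{i+1}, S_{i1} \rangle \in \mathcal{R} \;\&\; \cdots \;\&\; \langle y_{ik(i)}, y_{i+1}, S_{ik(i)} \rangle \in \mathcal{R} \;\&\; [r \in S_{i1} \rightarrow y_{i1} P_{rS} y_{i+1}] \;\&\; \cdots \;\&\; [r \in S_{ik(i)} \rightarrow y_{ik(i)} P_{rS} y_{i+1}]$
& $y_i$ is the consequence of exercising $\langle y_{i1}, y_{i+1}, S_{i1} \rangle, \ldots,
\langle y_{ik(i)}, y_{i+1}, S_{ik(i)} \rangle$ simultaneously$\big)$;

(d) for every finite sequence, $\Sigma' = z_1, z_2, \ldots, z_{\lambda'}$ of alternatives
in A such that

(i) $z_{\lambda'} = y_1$;

(ii) $z_1 P_{bS} y$;

(iii) for every $i = 1, 2, \ldots, \lambda' - 1$, either

$(\forall e)\; z_i P_{eS} z_{i+1}$, or

$(\exists T_{i1}, T_{i2}, \ldots, T_{ik'(i)}, z_{i1}, z_{i2}, \ldots, z_{ik'(i)})\big(\langle z_{i1}, z_{i+1}, T_{i1} \rangle \in \mathcal{R} \;\&\; \cdots \;\&\; \langle z_{ik'(i)}, z_{i+1}, T_{ik'(i)} \rangle \in \mathcal{R} \;\&\; [r \in T_{i1} \rightarrow z_{i1} P_{rS} z_{i+1}] \;\&\; \cdots \;\&\; [r \in T_{ik'(i)} \rightarrow z_{ik'(i)} P_{rS} z_{i+1}]$ & $z_i$
is the consequence of exercising
$\langle z_{i1}, z_{i+1}, T_{i1} \rangle, \ldots, \langle z_{ik'(i)}, z_{i+1}, T_{ik'(i)} \rangle$ simultaneously$\big)$,

there is a sequence, $\Sigma'' = w_1, w_2, \ldots, w_{\lambda''}$ of alternatives in A such that:

(iv) $w_{\lambda''} = z_1$;

(v) $yP_{bS}w_1$;

(vi) for every $i = 1, 2, \ldots, \lambda'' - 1$, either

$(\forall e)(w_i P_{eS} w_{i+1})$, or

$(\exists U_{i1}, U_{i2}, \ldots, U_{ik''(i)}, w_{i1}, w_{i2}, \ldots, w_{ik''(i)})\big(b \notin \bigcup_{j=1}^{k''(i)} U_{ij} \;\&\; \langle w_{i1}, w_{i+1}, U_{i1} \rangle \in \mathcal{R} \;\&\; \cdots \;\&\; \langle w_{ik''(i)}, w_{i+1}, U_{ik''(i)} \rangle \in \mathcal{R}
\;\&\; [r \in U_{i1} \rightarrow w_{i1} P_{rS} w_{i+1}] \;\&\; \cdots \;\&\; [r \in U_{ik''(i)} \rightarrow w_{ik''(i)} P_{rS} w_{i+1}]$
& $w_i$ is the consequence of exercising $\langle w_{i1}, w_{i+1}, U_{i1} \rangle, \ldots,
\langle w_{ik''(i)}, w_{i+1}, U_{ik''(i)} \rangle$ simultaneously$\big)$.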

Using this version of rights exercising, it is still possible to carry out the 
proof of Gibbard’s Pareto-consistent libertarian claim. 


7. Correctable Miscalculations 

Up to this point our criticisms of exercising rule (ii-1) have fallen into 
two classes: those for which there is a revision allowing Gibbard’s analysis 
to go through and those, like costless information and caution, that we 
write off as future research lines. Now we take up a problem that seriously 
affects Gibbard’s analysis. The easiest way to get into this is to go back 
and examine the Angelina-Edwin-Judge example of Section 4. There we 
saw that there are situations in which each individual might waive a right 
because of miscalculation, correctable with available information, that 
the other was going to exercise his right. For that specific example, the 
problem can be avoided by a natural revision to rule (ii-2) in which rights 
are waived only if the $y_1$ of a $\Sigma$ sequence is strictly worse than y (not just
if it is no better). But does this adjustment eliminate all cases of correctable
miscalculation?

That correctable miscalculation is to be expected arises from the asymmetry
of our rules. Even when everyone is following, say, rule (ii-4), it is
part of that exercising rule that everyone is erroneously assuming that
everyone else is following the classical exercising rule (ii). Consider
$C(\{w, x, y, z\}, S)$ where S has preference orderings for the three individuals
as follows:


    1    2    3

    y    z    x
    x    w    z
    w    x    w
    z    y    y





The rights system will be

$\mathcal{R} = \{\langle y, z, 1 \rangle, \langle z, x, 2 \rangle, \langle x, w, 3 \rangle\}$.

Now rule (ii-4) would have 3 calculate as follows: “Suppose I exercise
$\langle x, w, 3 \rangle$. Then, 2, who prefers z to x, would exercise $\langle z, x, 2 \rangle$ and then 1,
who prefers y to z, would exercise $\langle y, z, 1 \rangle$. There is no (repairing)
extension of $\Sigma = y, z, x$ since no one has rights over y and no alternative
in the agenda is Pareto-superior to y. The $\Sigma$ sequence would yield for me
an alternative, y, strictly worse than w. Thus I should waive $\langle x, w, 3 \rangle$.”
But this is clearly a miscalculation. Individual 2 would not exercise
$\langle z, x, 2 \rangle$ if he were also following rule (ii-4). The sequence $\Sigma = y, z$
together with $\langle y, z, 1 \rangle \in \mathcal{R}$ and $yP_{1S}z$ is 2’s rationale for waiving $\langle z, x, 2 \rangle$.
In assuming 3 to be following rule (ii-4) we are assuming 3 has all the
information about S and $\mathcal{R}$ and so has the information necessary to see that 2
could not reasonably be expected to behave in the way rule (ii-4) tells 3
to expect 2 to behave. The miscalculation is correctable.
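The three-person example can be replayed in code. The Python sketch below is an illustration under assumptions of our own. Since the example admits no repairing extension, it uses the simpler strict-preference waiver test of (ii-2) in place of the full rule (ii-4). The "corrected" calculation, in which 3 counts only on rights whose holders would not themselves waive, is a one-step device introduced here for illustration and is not a rule proposed in the text; all function names are hypothetical.

```python
def rank(order):
    """Map a best-first list of alternatives to positions (0 = most preferred)."""
    return {alt: k for k, alt in enumerate(order)}

def prefers(r, a, b):
    return r[a] < r[b]

def forced(v, u, b, prefs, usable):
    """Move from u to v forced by weak Pareto or by a usable right held by someone other than b."""
    if all(prefers(r, v, u) for r in prefs.values()):
        return True
    return any(e != b and (v, u, e) in usable and prefers(prefs[e], v, u)
               for e in prefs)

def waives(right, agenda, prefs, usable):
    """b waives if some chain of forced moves from x ends strictly worse than y."""
    x, y, b = right
    reached, stack = {x}, [x]
    while stack:
        u = stack.pop()
        for v in agenda:
            if v not in reached and forced(v, u, b, prefs, usable):
                reached.add(v)
                stack.append(v)
    return any(prefers(prefs[b], y, a) for a in reached)

agenda = ["w", "x", "y", "z"]
prefs = {1: rank(["y", "x", "w", "z"]),
         2: rank(["z", "w", "x", "y"]),
         3: rank(["x", "z", "w", "y"])}
rights = {("y", "z", 1), ("z", "x", 2), ("x", "w", 3)}

# Each individual assumes all others follow the classical rule (ii).
naive = {r: waives(r, agenda, prefs, rights) for r in rights}
# Illustrative correction: 3 relies only on rights their holders would not waive.
kept = {r for r in rights if not naive[r]}
corrected = waives(("x", "w", 3), agenda, prefs, kept)

print(naive[("x", "w", 3)], naive[("z", "x", 2)], corrected)  # True True False
```

Individual 3 waives when he expects 2 to exercise, yet 2 himself waives by the same calculation; once 3 takes that into account, his reason for waiving disappears.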

But the mere existence of such an example is not important. The
significant question for social choice theory is how serious the phenomenon
is for Gibbard’s Pareto-consistent-libertarian possibility theorem
and its proof. To answer that question, we must introduce some new
terminology. A choice function, C, realizes $\mathcal{R}$ iff for every pair (A, S)
in the domain of C, for every $b \in H$ and $x, y \in E$, we have

$x \in A \rightarrow y \notin C(A, S)$

whenever b exercises $\langle x, y, b \rangle$ at (A, S). C accords b an alienable right
to x over y iff there is at least one rights-system $\mathcal{R}$ such that C realizes $\mathcal{R}$
and $\langle x, y, b \rangle \in \mathcal{R}$. E is assumed to have the structure of a $\mu$-fold Cartesian
product of sets of “feature-alternatives”:

$E = M_1 \times M_2 \times \cdots \times M_\mu$

where $|M_i| \geq 2$ for all i. If $x = (x_1, x_2, \ldots, x_\mu)$ and $y = (y_1, \ldots, y_\mu)$
are two elements of E, they are j-variants iff

$(\forall i)[i \neq j \rightarrow x_i = y_i]$,

i.e., x and y differ from each other only in their jth feature. All this enables
us to present Gibbard’s possibility result:

Theorem 2. If $\mu \geq \nu$, then there is a choice function which satisfies
both the weak Pareto condition and the libertarian condition: for every
$b \in H$, there is a j, $1 \leq j \leq \mu$, such that for every pair of j-variants x and y,
C accords b an alienable right to x over y.

The effect our introduction of correctable miscalculations has on that
result is to be seen in Theorem 3 below. But first we will present more
terminology. We must describe the production function telling us the
consequence of simultaneous rights exercising (which appears naturally
in j-variant cases of the sort implied by Gibbard’s theorem). We wish to
work with cases like the white room, yellow room of Section 6, where
the production function to be used seemed obvious. When both
$\langle (w, w), (y, w), 1 \rangle$ and $\langle (y, y), (y, w), 2 \rangle$ are exercised at (y, w), we expect
the consequence to be (w, y). But it is not obvious what happens if, say,
both of $\langle (y, y), (w, w), 1 \rangle$ and $\langle (g, g), (w, w), 2 \rangle$ are exercised; are both
rooms yellow via 1 or both green via 2? Or what other combination?
The production function will not be obvious unless the rights system is
fairly simple, eliminating the kind of conflict illustrated in the yellow-green
example. We will focus on such a simple kind of system in a way that ties
in very closely with the j-variant idea. A rights system, $\mathcal{R}$, will be called
regular if with each nonempty subset, T, of H we associate a set I(T) of
integers such that if $\langle x, y, T \rangle \in \mathcal{R}$, then x and y differ only in features,
$M_i$, where $i \in I(T)$, and furthermore the following rule prevails:

$T_1 \neq T_2 \rightarrow$ [If both $\langle x, y, T_1 \rangle$ and $\langle z, y, T_2 \rangle$ are in $\mathcal{R}$, then
$i \in I(T_1) \cap I(T_2)$ implies that the ith components of x and z are identical].

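For regular rights systems we define the natural production function:
If $\langle x_1, y, T_1 \rangle, \langle x_2, y, T_2 \rangle, \ldots, \langle x_k, y, T_k \rangle$ are exercised simultaneously,
the consequence is z whose ith component is

the ith component of y if $i \notin \bigcup_{j=1}^{k} I(T_j)$,

the ith component of $x_j$ if $i \in I(T_j)$.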
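As a check on this definition, the natural production function can be written out and applied to the room-color example of Section 6. The Python sketch below is illustrative only: the name `natural_consequence`, the zero-based feature indices, and the dictionary `features_of` (our encoding of the map from T to I(T)) are assumptions of this sketch.

```python
def natural_consequence(current, exercised, features_of):
    """Consequence of exercising several rights over `current` simultaneously.

    exercised:   list of triples (x, y, T) with y equal to the current state
                 and T a frozenset of individuals.
    features_of: maps each T to the set I(T) of feature indices it governs.
    For a regular rights system, triples held by different sets never disagree
    on a shared feature, so the order of processing does not matter.
    """
    result = list(current)
    for x, y, T in exercised:
        if y != current:
            raise ValueError("every exercised right must be over the current state")
        for i in features_of[T]:
            result[i] = x[i]  # feature i takes the value prescribed by this right
    return tuple(result)

# Room colors: feature 0 is person 1's room, feature 1 is person 2's room.
features_of = {frozenset({1}): {0}, frozenset({2}): {1}}
state = ("y", "w")
exercised = [
    (("w", "w"), ("y", "w"), frozenset({1})),  # person 1 repaints his room white
    (("y", "y"), ("y", "w"), frozenset({2})),  # person 2 repaints his room yellow
]
print(natural_consequence(state, exercised, features_of))  # ('w', 'y')
```

Features governed by no exercising set keep the value they have in the current state, which is the first case of the definition.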

Finally, we must settle a problem about agenda. What might we say about
b’s exercise of $\langle x, y, b \rangle$ at (A, S) if $\langle z, y, e \rangle \in \mathcal{R}$ and the consequence
of exercising these rights simultaneously is an alternative not in A?
At such an abstract level, this question is impossible to answer. Here
we resolve this kind of dilemma by working with choice functions and
rights systems satisfying the following condition:

A choice function, C, and rights system, $\mathcal{R}$, are agenda closed with
respect to a production function if every agenda A in the domain of C has
the property that if $x \in A$ and z is the consequence of the simultaneous
exercise of rights over x then $z \in A$.






Theorem 3. There are sets E and H, with $\mu \geq \nu$, such that for every
nontrivial choice function C and rights system $\mathcal{R}$, if

(i) C satisfies the weak Pareto condition;

(ii) C satisfies the libertarian condition via $\mathcal{R}$;

(iii) $\mathcal{R}$ is regular;

(iv) the production function is the natural one;

(v) C and $\mathcal{R}$ are agenda closed with respect to the production function;

then there is an agenda A and society S in the domain of C and an individual,
b, such that b makes a correctable miscalculation at (A, S) if everyone
follows rule (ii-4).

Proof. Let $E = \{x_1, x_2, x_3\} \times \{y_1, y_2, y_3\} \times \{z_1, z_2, z_3\}$ and
$H = \{1, 2, 3\}$. Without loss of generality, assume that the assignment of
features to individuals of the libertarian condition is the identity map;
C accords b an alienable right to x over y if x and y are b-variants. Thus
the rights system $\mathcal{R}$ that C realizes must contain $\langle x, y, b \rangle$ if x and y are
b-variants. $\mathcal{R}$ can contain no other triples. For suppose $\mathcal{R}$ contains a
triple with third element T. I(T) must be nonempty. Without loss of
generality suppose $1 \in I(T)$ and $\langle x, y, T \rangle \in \mathcal{R}$ where x and y differ in at least
their first feature. But $\langle z, y, 1 \rangle \in \mathcal{R}$ where z has first component different
from either x or y (here is where we need three versions for each feature).
Thus regularity requires T = {1}. It is then easily shown that regularity
must imply that x and y are 1-variants. So $\mathcal{R}$ contains $\langle x, y, T \rangle$ iff T = {b}
and x and y are b-variants.
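The rights system characterized in this paragraph is small enough to enumerate. The Python sketch below is a rough illustration under assumptions of our own: feature values are plain strings, individual b governs feature b, and we leave out the degenerate triples with x = y, which the definition of b-variants formally admits but which confer nothing.

```python
from itertools import product

# Three features with three versions each; E has 27 alternatives.
FEATURES = [("x1", "x2", "x3"), ("y1", "y2", "y3"), ("z1", "z2", "z3")]
E = list(product(*FEATURES))

def is_variant(a, c, j):
    """True if a and c agree in every feature except possibly feature j (1-based)."""
    return all(a[i] == c[i] for i in range(len(a)) if i != j - 1)

# Individual b holds a right to a over c exactly when a and c are distinct b-variants.
rights = {(a, c, b)
          for b in (1, 2, 3)
          for a in E
          for c in E
          if a != c and is_variant(a, c, b)}

print(len(E), len(rights))  # 27 162
```

Each of the 27 alternatives can be changed by each individual to 2 other versions of his own feature, which gives 27 x 2 x 3 = 162 triples.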

Now consider A = E (which must be in the domain of C by nontriviality
and the closure condition) and S given by


[Table: the strict preference orderings of individuals 1, 2, and 3 over the
27 alternatives of E, one column per individual, most preferred first. Only
the rankings used in the argument are recoverable: individual 1 ranks
$(x_1, y_1, z_1)$ first; individual 2 ranks $(x_2, y_1, z_1)$ first, $(x_2, y_2, z_1)$ second,
and $(x_1, y_1, z_1)$ third; individual 3 ranks $(x_2, y_2, z_1)$ first, $(x_1, y_2, z_1)$
second, and $(x_2, y_2, z_2)$ third.]


Among the Pareto-optimal alternatives are $(x_2, y_2, z_1)$ and $(x_2, y_2, z_2)$.
Should 3 exercise $\langle (x_2, y_2, z_1), (x_2, y_2, z_2), 3 \rangle$? Consider the sequence
$\Sigma = (x_1, y_1, z_1), (x_2, y_1, z_1), (x_2, y_2, z_1)$. If 3 follows rule (ii-4) he will
note that 2 prefers $(x_2, y_1, z_1)$ over $(x_2, y_2, z_1)$ and 1 prefers $(x_1, y_1, z_1)$
to $(x_2, y_1, z_1)$. The result, $(x_1, y_1, z_1)$, is strictly worse in the eyes of
3 than $(x_2, y_2, z_2)$. There is no repairing sequence since no invocation of
either the weak Pareto condition or rights-exercising (via the classical
rule) would lead to a change in the first component away from $x_1$, and
all alternatives with first component $x_1$ are strictly worse in the eyes of 3
than $(x_2, y_2, z_2)$ except $(x_1, y_2, z_1)$; but 2 would change this to $(x_1, y_1, z_1)$
which is worse than $(x_2, y_2, z_2)$. Thus 3, following rule (ii-4), should
waive $\langle (x_2, y_2, z_1), (x_2, y_2, z_2), 3 \rangle$.

We now show that the waiving of $\langle (x_2, y_2, z_1), (x_2, y_2, z_2), 3 \rangle$ is a
correctable miscalculation. The main point is simply that 2, following
(ii-4), would not exercise $\langle (x_2, y_1, z_1), (x_2, y_2, z_1), 2 \rangle$. For the sequence
$\Sigma = (x_1, y_1, z_1), (x_2, y_1, z_1)$, with 1 exercising his right to $(x_1, y_1, z_1)$
over $(x_2, y_1, z_1)$, yields an alternative $(x_1, y_1, z_1)$ strictly worse in the
eyes of 2 than $(x_2, y_2, z_1)$. It is easily seen that there is no repairing sequence.
Thus 2 would waive $\langle (x_2, y_1, z_1), (x_2, y_2, z_1), 2 \rangle$.

Finally, a check will show that there is no other sequence, $\Sigma$, leading 3
to waive $\langle (x_2, y_2, z_1), (x_2, y_2, z_2), 3 \rangle$ via (ii-4) that does not also involve
a correctable miscalculation. ∎

One observation on this proof is useful. The elaborate example 
developed involved, for each individual, only unconditional preferences. 
Thus no return to the conditional-unconditional distinction in the early 
part of Gibbard’s paper can be invoked to save us from correctable
miscalculations.






While I have presented Theorem 3 as a criticism of rights exercising 
rule (ii-4), it might also be seen as a criticism of confining our attention 
only to regular rights systems. But one must be careful here; regularity 
of $\mathcal{R}$ is only sufficient and not necessary. The gist of the proof can go
through for some non-regular $\mathcal{R}$ if we constrain what rights can be
exercised simultaneously. 


8. Conclusion 

While economists’ models are probably somewhat remiss in their 
failure to take into account mistakes, miscalculations, and other avoidable 
errors, I do not think anyone would want to base a libertarian moral
philosophy on the assumption that there is a certain kind of error that 
all people will repeatedly make. Rule (ii-4) is a very weak basis for a moral 
philosophy. 

But it is vitally important to see that this does not simply eradicate 
Gibbard’s work and return us to Sen’s Paretian liberal problem. Gibbard’s 
paramount contribution is the decomposition of a decisiveness condition 
(Sen’s liberalism) into rights-existence and rights-exercising. He used this
framework to see the “impossibility of a Paretian liberal” as a critique 
of the classical exercising rule that Sen used. Gibbard’s revised rule and 
our revisions of his revision are all unacceptable, but so is the classical 
rule. It is Gibbard who has gotten us to ask the right questions. 

The moral of all this is that all decisiveness conditions (not just 
liberalism) should be decomposed into rights-existence and
rights-exercising. The next step is to try to compose rational rights-exercising
rules for individuals (and in a way that pays attention to the realities 
of information storage and processing limits). Note that the Pareto 
condition is one of the decisiveness conditions to be decomposed. We 
must ask if sometimes it ought not to be exercised. 

References 

1. A. K. Sen, The impossibility of a Paretian liberal, J. Political Econ. 78 (1970),
152-157.

2. Y. K. Ng, The possibility of a Paretian liberal: Impossibility theorems and cardinal
utility, J. Political Econ. 79 (1971), 1397-1402.

3. C. Hillinger and V. Lapham, The impossibility of a Paretian liberal: Comment
by two who are unreconstructed, J. Political Econ. 79 (1971), 1403-1405.

4. A. K. Sen, The impossibility of a Paretian liberal: Reply, J. Political Econ. 79 
(1971), 1406-1407. 

5. R. N. Batra and P. K. Pattanaik, On some suggestions for having non-binary 
social choice functions, Theory and Decision 3 (1972), 1-11. 





6. V. S. Ramachandra, Liberalism, non-binary choice and Pareto Principle, Theory 
and Decision 3 (1972), 49-54. 

7. A. Gibbard, A Pareto-consistent libertarian claim, J. Econ. Theory 7 (1974),
388-410.

8. J. S. Kelly, The impossibility of a just liberal, unpublished manuscript, 1974. 

9. R. Gardner, The logic of the liberal paradox, unpublished manuscript, 1974. 

10. B. J. Fine, Individual liberalism in a Paretian society, J. Political Econ., to Appear. 

11. B. J. Fine, Interdependent preferences and liberalism in a Paretian society, Birkbeck
College, University of London, Discussion Paper No. 15 (1974). 

12. A. T. Peacock and C. K. Rowley, Pareto optimality and the political economy 
of liberalism, J. Political Econ. 80 (1972), 476-490. 

13. A. Gibbard, Manipulation of voting schemes: a general result, Econometrica 41 
(1973), 587-601. 

14. M. Satterthwaite, “The Existence of a Strategy Proof Voting Procedure: A Topic 
in Social Choice Theory,” Ph. D. dissertation, University of Wisconsin, Madison, 
1973. 

15. D. Schmeidler and H. Sonnenschein, The possibility of a cheat proof social choice 
function: a theorem of A. Gibbard and M. Satterthwaite, Northwestern University 
Discussion Paper No. 89 (May 1974). 

16. R. Gardner, Some implications of the Gibbard Satterthwaite theorem, un¬ 
published manuscript, 1974. 

17. G. Richardson, Information and the manipulation of social choice mechanisms, 
unpublished manuscript, 1974. 

18. E. A. Pazner and E. Wesley, Infinite voters and the possibility of a cheatproof
social choice function, unpublished manuscript, 1974. 



JOURNAL OF ECONOMIC THEORY 13, 154-167 (1976) 


Notes, Comments, and Letters to the Editor 

The Effect on Optimal Consumption of Increased Uncertainty 
in Labor Income in the Multiperiod Case* 


We consider a multiperiod, additive utility, optimal consumption model with
a riskless investment and a stochastic labor income. The main result is that for 
utility functions belonging to the set F, consumption decreases when we go from 
any sequence of distribution functions representing labor income to a more 
risky sequence. A concave utility function belongs to F if its first derivative 
exists everywhere and is convex. 


1. Introduction 

The impact on consumption of increased uncertainty in future labor 
or capital income has been examined by a number of authors in the last 
ten years. As illustrated by Sandmo [18], the answers one gets are different 
in the two cases of uncertain labor income and uncertain capital income. 
Therefore, in order to separate these effects, the models with random labor 
income generally have one nonrisky investment opportunity, and those 
with random investment opportunities have a deterministic (or zero) 
labor income. The model in this paper conforms to the above dichotomy. 
The only exception to that rule seems to be in Merton [11, Sect. 8], 
which treats the case of a nondecreasing Poisson income stream, an
exponential utility function, and two investment opportunities, one 
riskless and the other described by Brownian motion. 

Three relatively early papers which examine the random labor income 
case are the two-period models of Leland [8], Sandmo [18], and Drèze
and Modigliani [5]. Their problem is: given the first period labor income
$y_1$ and the distribution function of the labor income $Y_2$ in period two,
choose consumption $c_1$ in period 1, $0 \le c_1 \le y_1$, so as to maximize
$EU(c_1, (1 + r)(y_1 - c_1) + Y_2)$.
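To make the two-period problem concrete, the following sketch solves it by grid search. It is an illustration only: the utility $u(c) = \sqrt{c}$ (which has a convex first derivative), the discounted additive form $u(c_1) + \beta u(c_2)$, the function name `optimal_c1`, and all numerical values are assumptions that do not come from the papers cited.

```python
import math

def optimal_c1(y1, r, beta, incomes, probs, steps=10000):
    """Grid search for the first-period consumption c1 in [0, y1] maximizing
    u(c1) + beta * E u((1 + r) * (y1 - c1) + Y2), with u = sqrt (an assumption)."""
    best_c, best_val = 0.0, -math.inf
    for k in range(steps + 1):
        c1 = y1 * k / steps
        wealth = (1 + r) * (y1 - c1)          # savings carried into period two
        val = math.sqrt(c1) + beta * sum(
            p * math.sqrt(wealth + y2) for y2, p in zip(incomes, probs))
        if val > best_val:
            best_c, best_val = c1, val
    return best_c

# Second-period income with mean 5: a sure income versus a mean-preserving spread.
c_safe = optimal_c1(10.0, 0.05, 0.95, [5.0], [1.0])
c_risky = optimal_c1(10.0, 0.05, 0.95, [1.0, 9.0], [0.5, 0.5])
print(c_safe, c_risky)
```

Under these assumed numbers the grid optimum with the risky income lies below the one with the sure income, which is the direction of the effect that the two-period literature establishes for utilities with a positive third derivative.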

Assuming the utilities are additive ($U = u_1 + u_2$), Leland [8, Eq. (25)]
concludes that concavity and a positive third derivative imply that there
is a decrease in consumption when going from the deterministic income
case to the random income case with the same mean and an infinitesimal
random element (such that a second-order Taylor approximation is
valid). Sandmo compares parameterized versions of $Y_2$ of the form
$aY_2 + (1 - a)E(Y_2)$, $0 \le a \le k$, where $k$ is such that the income remains
nonnegative, and he demonstrates that $c_1$ is a decreasing function of $a$
(and of risk) when $U$ has decreasing temporal risk aversion. His results
imply that in the case of a concave additive utility function, $c_1$ is a
decreasing function of $a$ when the third derivative is positive. In [5]
Drèze and Modigliani look at the income and substitution effects of
increased risk in labor income.

* This research was supported by the National Science Foundation under ENG
74-13494.

The model that we will be working with is an infinite horizon additive 
utility model which the author used in [12]. There the main qualitative 
result was that for isoelastic utility functions, consumption decreases 
when we go from the deterministic labor income case to the random labor 
income case with the same mean. Here we will show that for utility 
functions belonging to the set F, consumption decreases when we go from 
any sequence of distribution functions representing labor income to a 
more risky sequence where we are using increased risk in the sense of 
Rothschild and Stiglitz [15, 17]. A concave utility function belongs to F 
if its first derivative exists everywhere and is convex. Therefore if a concave 
utility function is thrice differentiable, then it belongs to F if its third 
derivative is nonnegative. It is an easy exercise to verify that the isoelastic 
utility functions, $(1/\gamma) c^{\gamma}$, $\gamma < 1$, $\gamma \ne 0$, belong to F.
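The exercise can be written out directly. The derivation below assumes $c > 0$ and writes $\gamma$ for the exponent of the isoelastic utility function; since $\gamma - 1 < 0$ and $\gamma - 2 < 0$, the third derivative is positive, so $u$ is concave with a convex first derivative.

```latex
u(c) = \tfrac{1}{\gamma}\, c^{\gamma}, \qquad \gamma < 1,\ \gamma \neq 0,\ c > 0:
\qquad
u'(c) = c^{\gamma - 1} > 0, \quad
u''(c) = (\gamma - 1)\, c^{\gamma - 2} < 0, \quad
u'''(c) = (\gamma - 1)(\gamma - 2)\, c^{\gamma - 3} > 0 .
```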

In view of the importance of a nonnegative third derivative as exhibited 
in Leland [8] and Mirman [13], it is not surprising that the third derivative 
is also the key condition in this model. Its import is made all the more 
plausible when we recall the certainty-equivalence results of Theil [22], 
Simon [21], and recently Duchan [6]. Essentially, their results state that 
with a quadratic utility function (third derivative zero) and linear state 
equation with an additive random disturbance, the decisions are unaltered 
if the random elements are replaced by their means. 

In the random capital income case with an isoelastic utility function, 
the effect on consumption of increased risk in the return on capital is 
different depending on whether the exponent $\gamma < 0$ or $\gamma > 0$ (Rothschild and Stiglitz
[16]). Therefore we get the differing conclusions in the random labor 
income and random capital income models with an infinite horizon that 
Sandmo observed in his two-period model. In [4, p. 354], Diamond and 
Stiglitz have further analyzed and clarified the effect of increased risk 
in the random capital income case with an isoelastic utility function using 
the concept of mean utility preserving increase in risk. 

We will also establish two secondary results. The first is Theorem 1,
which establishes an equivalent definition for F. The second is at the end
of the paper where it is shown that the individual’s expected utility
decreases when we go from any sequence of distribution functions
representing labor income to a more risky sequence.

Since completing this research, two papers, Schechtman [19] and Sibley
[20], whose results overlap mine, have been brought to my attention. Both 
papers contain other topics as well. Sibley obtains the same main result 
when the (concave) utility function has a nonnegative third derivative 
and an infinite first derivative at zero. This is a more restrictive class and, 
for example, the concave quadratic utility functions are included in our 
results but not his. Schechtman works with the same class of utility functions F
as we do. He also proves our Lemma 1 (his Theorem 3.6) using a quite 
different method of proof. His main result is weaker since he compares the 
stochastic case with the deterministic case and does not use the concept 
of increasing risk. 


2. The Model and the Main Results 

Except for a more general class of utility functions, the model we 
consider is the same as that presented by Miller in [12], so that we will 
limit ourselves to the bare essentials and refer the reader to [12] for
discussion of the model. Consider

$(x, j)$: the state of the system, where $x$ represents the capital at the
beginning of period $j$.

$r - 1$: the rate of interest for both lending and borrowing, where
$r > 1$.

$Y_j$: the nonnegative random income received at the end of period $j$.
It is convenient to divide $Y_j$ into certain and uncertain parts by
$Y_j = y_j + R_j$, where $y_j = \sup\{h : F_j(h) = 0\}$ and $F_j(\cdot)$ is the distribution
function of $Y_j$. We also assume that the $Y_j$ are independent, but not
identically distributed, that the means of $R_j$ are uniformly bounded, and
that $\sum_{i=1}^{\infty} r^{-i} y_i < \infty$. It is significant that we do not assume that the $Y_j$
are identically distributed, for otherwise the optimal decision in period $j$
would depend on the value of $x_j$ and not on $j$.

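$D_j$: the amount of debt allowed in period $j$, equal to $\sum_{i=j}^{\infty} r^{-(i-j+1)} y_i$, the
present value at the beginning of period $j$ of the certain part of all future income.

$D_j$ is finite by our assumption above concerning the $y_j$. Thus we allow
the individual to borrow against certain future income and $x_j$ can take
on values in $[-D_j, \infty)$.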





$c_j$: the consumption in period $j$. We require that $0 \le c_j \le x_j + D_j$.

$U(c_1, c_2, c_3, \ldots)$: the utility function for all feasible $c_1, c_2, \ldots$,
equals $\sum_{i=1}^{\infty} \alpha^{i-1} u(c_i)$, where $\alpha$ is a discount factor between 0 and 1. We
will restrict our attention to $u \in F$ where $F$ is defined below.

Definition. A concave function $g: I \to R$, with the convex set $I \subset R$,
belongs to $F$ if its first derivative, $g'$, exists everywhere on the interior of
$I$ and is convex.

In this paper $I$ will be $[0, \infty)$ or $(0, \infty)$ if $g$ is a utility function, and
$[-D_j, \infty)$ if $g$ is related to the optimal return function of period $j$. This
class of utility functions was developed independently in a paper by
Schechtman [19].

The decision making takes place as follows. In period 1 the individual
has $x_1$ units of capital. He decides to consume $c_1$, where $0 \le c_1 \le x_1 + D_1$,
and he receives a utility $u(c_1)$. The resulting capital (or debt) grows to
$r(x_1 - c_1)$ and a random income $Y_1$ is received at the end of period 1
so that $x_2$ equals $r(x_1 - c_1) + Y_1$. In general starting from state $(x, j)$ the
new state is given by

$$T(x, j) = (r(x - c) + Y_j,\; j + 1). \tag{1}$$

By a policy $\delta$ we mean a decision rule that specifies the amount
$c_j = \delta(x, j)$ that we consume given that we are in state $(x, j)$. We let
$f_\delta(x, j)$ be the expected value of $U$ when using an admissible policy $\delta$ and
starting from state $(x, j)$, and define $f(x, j) = \sup_\delta f_\delta(x, j)$.

A policy $\delta^*$ is said to be optimal if $f_{\delta^*} = f$. The functional equation of
dynamic programming is

$$f(x, j) = \sup_{0 \le c_j \le x_j + D_j} \{u(c_j) + \alpha E f(T(x, j))\}. \tag{2}$$


Some useful notation is

$$h((x, j), c, v) = u(c) + \alpha E v(T(x, j)),$$

$$(Av)(x, j) = \sup_c h((x, j), c, v).$$

Thus Eq. (2) can be written as

$$f(x, j) = Af(x, j), \tag{3}$$

so that the problem of finding solutions to (2) is then equivalent to the
problem of finding fixed points of $A$.
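The operator $A$ can be sketched numerically on a grid of capital levels. What follows is a minimal sketch and not the method of [12]: it assumes no borrowing (the certain part of income is zero, so the debt limit is zero), the utility $u(c) = \sqrt{c}$, a two-point income distribution, linear interpolation clamped at the ends of the grid, and a finite grid of candidate decisions; the names `interp` and `bellman` are hypothetical.

```python
import math
from bisect import bisect_right

def interp(grid, vals, x):
    """Piecewise-linear interpolation of vals on grid, clamped at both ends."""
    if x <= grid[0]:
        return vals[0]
    if x >= grid[-1]:
        return vals[-1]
    k = bisect_right(grid, x) - 1
    w = (x - grid[k]) / (grid[k + 1] - grid[k])
    return (1 - w) * vals[k] + w * vals[k + 1]

def bellman(grid, v, u, r, alpha, incomes, probs, n_choices=200):
    """Return (Av, policy) on the grid, where
    (Av)(x) = max over 0 <= c <= x of u(c) + alpha * E v(r * (x - c) + Y)."""
    new_v, policy = [], []
    for x in grid:
        best_val, best_c = -math.inf, 0.0
        for k in range(n_choices + 1):
            c = x * k / n_choices
            ev = sum(p * interp(grid, v, r * (x - c) + y)
                     for y, p in zip(incomes, probs))
            val = u(c) + alpha * ev
            if val > best_val:  # strict '>' keeps the smallest maximizer on ties
                best_val, best_c = val, c
        new_v.append(best_val)
        policy.append(best_c)
    return new_v, policy

grid = [0.25 * i for i in range(81)]      # capital levels 0, 0.25, ..., 20
incomes, probs = [0.0, 2.0], [0.5, 0.5]   # hypothetical income distribution
v0 = [0.0] * len(grid)                    # start from the zero function
v1, c1 = bellman(grid, v0, math.sqrt, 1.05, 0.95, incomes, probs)
v2, c2 = bellman(grid, v1, math.sqrt, 1.05, 0.95, incomes, probs)
```

Starting from the zero function, one application of `bellman` gives the one-period problem (consume everything), and each further application adds a period to the horizon. Keeping the smallest maximizer on ties mirrors the tie-breaking convention used for $c^*$ in the text.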

An interpretation that can be given to the function $h((x, j), c, v)$ is that
it represents the expected return in a one-period model where the state is
$(x, j)$, the decision is $c$, and $v$ is the terminal reward function. In turn,
$(Av)(x, j)$ represents the expected return in the same situation when an
optimal decision is made. Often the $v$ chosen will be the optimal return
function.

Let $v$ be fixed, and for a given state $(x, j)$ let $c^*(x, j)$ be the (feasible)
value of $c$ which maximizes $h((x, j), c, v)$. If both $v$ and $u$ are concave
functions, then it is known (and also very easy to prove) that both

$$c^*(x, j) \text{ and } x - c^*(x, j) \text{ are nondecreasing functions of } x. \tag{4}$$

In the event that there is more than one optimal decision we let $c^*(x, j)$
be the smallest such decision.

In this paper we will assume that we are only considering utility
functions, a rate of interest, and a discount factor, such that the optimal
return function $f$ is finite-valued and can be obtained by the method of
successive approximations

$$f = \lim_{n \to \infty} A^n v, \tag{5}$$

where $v \in V$ and $A^n v$ means $A$ applied $n$ times to $v$. The function space $V$
was defined in [12] and is made up of functions $v$ defined on the state
space $R^+ \times I^+$ into $R$ which are increasing, concave, and continuous in $x$
for each $j \in I^+$. Furthermore $v$ must satisfy the following boundedness
condition

$$\sup_{(x, j)} \{|v(x, j)| / \max(|u(x)|, D)\} < \infty,$$


where $D$ is a positive constant. In order to establish (5), one needs to
verify (for the given utility function, rate of interest, discount factor)
that $A$ maps $V$ into $V$ and that $A$ is a contraction mapping. We also
have that $f$ belongs to $V$ since $A^n v$ belongs to $V$ and $V$ is closed [12]. In
[12] this question was examined in detail for the isoelastic functions.
For example, with the log utility function it was shown that a unique
finite valued $f$ satisfying (3) exists if we restrict $\epsilon(x_j + D_j) \le c_j \le
(1 - \epsilon)(x_j + D_j)$ for any fixed $\epsilon > 0$. In order to go to the case here of
$0 \le c_j \le (x_j + D_j)$ one needs to go through the exercise of showing the
nonoptimality (with respect to $f$) of the newly admissible $c_j$. The difficulty
is that the basic papers of discrete dynamic programming, Blackwell [1]
and Denardo [3], require that the reward function be bounded over all
admissible states and decisions, an assumption which is not satisfied by
any unbounded $u$. Only recently have techniques been developed which
get away from this restriction [7, 9, 10]. Fortunately, there is no difficulty
whatsoever in the finite period case, so our results apply without
qualification for all $u \in F$.

Theorem 1. A concave function $g: I \to R$, where the convex set $I \subset R$,
belongs to $F$ if and only if for every set of $\lambda_i$, $\Delta_i$, $i = 1, \ldots, n$, satisfying
$\lambda_i \ge 0$, $\sum \lambda_i = 1$, and $\sum \lambda_i \Delta_i = 0$,

$$g(x_1) - \sum_{i=1}^{n} \lambda_i g(x_1 + \Delta_i) \ge g(x_2) - \sum_{i=1}^{n} \lambda_i g(x_2 + \Delta_i), \tag{6}$$

where $x_2 \ge x_1$, and $x_2$, $x_1$ are in the interior of $I$. From Rockafellar [14,
Theorems 23.1, 24.1, 24.2, Corollary 24.2.1] the convexity of $g'$ implies
that

(a) the right-hand and left-hand derivatives of $g'$, $g''_+$ and $g''_-$, exist
everywhere on the interior of $I$, are increasing, and satisfy $g''_+ \ge g''_-$;

(b) for any $a, b \in I$,

$$g'(b) - g'(a) = \int_a^b g''_+(t)\,dt = \int_a^b g''_-(t)\,dt.$$

Proof. We begin the proof of the “if” part by establishing that $g'$
exists everywhere on the interior of $I$. Assume the contrary, that is for
some $x$, the derivative does not exist. Since $g$ is concave, both the right-
and left-hand derivatives exist at $x$ and we must have $g'_+(x) - g'_-(x) =
k < 0$, or

$$\lim_{y \downarrow 0} \frac{g(x + y) - g(x) - g(x) + g(x - y)}{y} = k < 0.$$

Equation (6) implies that $(-g(x) + (g(x + y)/2) + (g(x - y)/2))$ is an
increasing function of $x$. Consequently, the derivative cannot exist for
any $x' < x$, which is inconsistent with the concavity of $g$.

It remains to show that for any $x_2 > x_1$ in the interior of $I$ and
$0 < \lambda < 1$,

$$-g'(\lambda x_1 + (1 - \lambda) x_2) + \lambda g'(x_1) + (1 - \lambda) g'(x_2) \ge 0.$$

The derivatives equal the right hand derivatives so that the left hand side
equals the limit as $y \downarrow 0$ of

$$(1/y)[-g(\lambda x_1 + (1 - \lambda) x_2 + y) + g(\lambda x_1 + (1 - \lambda) x_2)
+ \lambda g(x_1 + y) - \lambda g(x_1) + (1 - \lambda) g(x_2 + y) - (1 - \lambda) g(x_2)].$$

The term in brackets is nonnegative, since by (6)

$$g(\lambda x_1 + (1 - \lambda) x_2) - \lambda g(x_1) - (1 - \lambda) g(x_2)
\ge g(\lambda x_1 + (1 - \lambda) x_2 + y) - \lambda g(x_1 + y) - (1 - \lambda) g(x_2 + y).$$

To establish the “only if” part, we must show that $g \in F$ implies (6)
which we rewrite as

$$\sum \lambda_i [g(x_1) - g(x_1 + \Delta_i) - g(x_2) + g(x_2 + \Delta_i)] \ge 0.$$

For any $i$,

$$g(x_1) - g(x_1 + \Delta_i) - g(x_2) + g(x_2 + \Delta_i)$$

$$= -\int_{x_1}^{x_1 + \Delta_i} g'(y)\,dy + \int_{x_2}^{x_2 + \Delta_i} g'(y)\,dy$$

$$= -\int_{x_1}^{x_1 + \Delta_i} \Big[g'(x_1) + \int_{x_1}^{y} g''_+(z)\,dz\Big] dy
+ \int_{x_2}^{x_2 + \Delta_i} \Big[g'(x_2) + \int_{x_2}^{y} g''_+(z)\,dz\Big] dy,$$

using (b) of Theorem 1,

$$\ge -\Delta_i [g'(x_1) - g'(x_2)], \tag{7}$$

since $g''_+$ is increasing. To see this inequality, observe that $g''_+(z_2) \ge g''_+(z_1)$
where $z_2$ is the same distance above $x_2$ that $z_1$ is above $x_1$ when $\Delta_i > 0$.
If $\Delta_i < 0$ then $y < x_1$ or $y < x_2$ as the case may be, and $g''_+(z_2) \ge g''_+(z_1)$
where $z_2$ is the same distance below $x_2$ that $z_1$ is below $x_1$. Therefore

$$\sum \lambda_i (g(x_1) - g(x_1 + \Delta_i) - g(x_2) + g(x_2 + \Delta_i))
\ge -\sum \lambda_i \Delta_i (g'(x_1) - g'(x_2)) = 0.$$

Q.E.D.
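Inequality (6) is easy to probe numerically. The sketch below is an illustration under assumed weights and displacements, with hypothetical helper names `jensen_gap` and `satisfies_6`: it checks (6) for $\log x$ (whose derivative $1/x$ is convex), for $-x^2$ (derivative linear, so (6) holds with equality), and for $-x^4$ (concave, but with a concave derivative on the positive axis, so (6) fails).

```python
import math

def jensen_gap(g, x, lambdas, deltas):
    """g(x) - sum_i lambda_i * g(x + delta_i): one side of inequality (6)."""
    return g(x) - sum(l * g(x + d) for l, d in zip(lambdas, deltas))

def satisfies_6(g, x1, x2, lambdas, deltas, tol=1e-12):
    """Check inequality (6) at one pair x1 <= x2 for the given weights."""
    assert x2 >= x1
    assert abs(sum(lambdas) - 1.0) < 1e-12                        # weights sum to one
    assert abs(sum(l * d for l, d in zip(lambdas, deltas))) < 1e-12  # mean-zero displacement
    return (jensen_gap(g, x1, lambdas, deltas)
            >= jensen_gap(g, x2, lambdas, deltas) - tol)

lambdas = [0.5, 0.3, 0.2]        # assumed weights
deltas = [-1.0, 1.0, 1.0]        # assumed displacements with weighted mean zero

log_ok = satisfies_6(math.log, 2.0, 5.0, lambdas, deltas)
quad_ok = satisfies_6(lambda x: -x * x, 2.0, 5.0, lambdas, deltas)
quartic_ok = satisfies_6(lambda x: -x ** 4, 2.0, 5.0, lambdas, deltas)
print(log_ok, quad_ok, quartic_ok)
```

A single pair of points cannot prove membership in $F$, but a failure at any pair, as with $-x^4$ here, is enough to rule it out.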

We note that $F$ is large enough to include utility functions $u$ whose
absolute risk aversion, $-u''/u'$, is decreasing. This is true since $u'$ is
decreasing by concavity and therefore we must have $-u''$ decreasing.
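The step can be spelled out as follows, assuming in addition that $u' > 0$ so that the risk-aversion ratio is well defined and nonnegative. The first line is the argument in the text (a product of two nonnegative decreasing functions is decreasing); the second line is the same conclusion for a thrice differentiable $u$.

```latex
\rho(c) = -\frac{u''(c)}{u'(c)} \ge 0 \ \text{decreasing}, \quad u'(c) > 0 \ \text{decreasing}
\;\Longrightarrow\;
-u''(c) = \rho(c)\, u'(c) \ \text{decreasing}
\;\Longrightarrow\; u' \ \text{convex}.

\rho'(c) = \frac{-u'''(c)\, u'(c) + u''(c)^2}{u'(c)^2} \le 0
\;\Longrightarrow\;
u'''(c) \ge \frac{u''(c)^2}{u'(c)} \ge 0 .
```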

We also want the result that if $g \in F$ and $Z$ is any random variable with
zero expectation such that $Eg(x_1 + Z)$ and $Eg(x_2 + Z)$ exist, then

$$E[g(x_1) - g(x_1 + Z) - g(x_2) + g(x_2 + Z)] \ge 0, \quad \text{when } x_2 > x_1. \tag{8}$$

This follows from (6) if $Z$ is a simple function. In order to go from simple
functions to random variables we apply the same method of proof as
that in Chung [2, Theorem 9.1.4]. The result is also true if we replace
$x_1$ and $x_2$ by $x_1 + X$ and $x_2 + X$ where $X$ is a random variable, and $Z$ is
a random variable such that $E(Z \mid X = x) = 0$ for all $x$. Again assuming
that all expectations are defined we get (by conditioning on $X = x$)

$$E[g(x_1 + X) - g(x_1 + X + Z) - g(x_2 + X) + g(x_2 + X + Z)] \ge 0. \tag{9}$$

If $X$ is a nonnegative random variable with a finite mean, then $v(x) =
Eg(x + X)$ also exists and is concave. If we let $Z$ be the discrete random
variable $P(Z = \Delta_i) = \lambda_i$, where $\Delta_i$, $\lambda_i$, $i = 1, \ldots, n$, have the properties
of those same terms in (6), and be independent of $X$, then (9) shows that
$v \in F$.

Next, we examine the idea of increasing risk as defined by Rothschild
and Stiglitz [15, 17]. There they establish the equivalence of three measures
of risk when comparing two random variables. The definition most
useful for our purposes is that $Y$ is more risky than $X$ if and only if

$$Y \overset{d}{=} X + Z,$$

where $\overset{d}{=}$ means “has the same distribution as” and $Z$ is a random
variable such that $E(Z \mid X = x) = 0$ for all $x$. Clearly from (9) we have
that if $Y$ is more risky than $X$ (and all expectations exist) then $g \in F$ and
$x_2 > x_1$ imply that

$$E[g(x_1 + X) - g(x_1 + Y) - g(x_2 + X) + g(x_2 + Y)] \ge 0. \tag{10}$$
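A small discrete example may help fix the definition and inequality (10). In the sketch below the joint distribution of $(X, Z)$, the choice $g = \log$, the wealth levels 1 and 4, and the helper names are all assumptions made for illustration; the riskier income is built as the sum $X + Z$ with conditional mean of $Z$ equal to zero.

```python
import math

# Joint distribution of (X, Z) as (probability, x, z) triples; illustrative numbers.
joint = [(0.25, 1.0, -1.0), (0.25, 1.0, 1.0),
         (0.25, 3.0, -1.0), (0.25, 3.0, 1.0)]

def conditional_mean_of_z(joint, x):
    """E(Z | X = x) for the discrete joint distribution."""
    mass = sum(p for p, xv, _ in joint if xv == x)
    return sum(p * z for p, xv, z in joint if xv == x) / mass

def lhs_of_10(g, x1, x2, joint):
    """E[g(x1 + X) - g(x1 + Y) - g(x2 + X) + g(x2 + Y)] with Y = X + Z."""
    return sum(p * (g(x1 + x) - g(x1 + x + z) - g(x2 + x) + g(x2 + x + z))
               for p, x, z in joint)

noise_is_fair = all(conditional_mean_of_z(joint, x) == 0.0 for x in (1.0, 3.0))
value_log = lhs_of_10(math.log, 1.0, 4.0, joint)
value_quadratic = lhs_of_10(lambda w: -w * w, 1.0, 4.0, joint)
print(noise_is_fair, value_log, value_quadratic)
```

For the logarithm the expression is strictly positive: the expected loss from the added noise is larger at the lower wealth level. For a quadratic function it is zero, in line with the certainty-equivalence remark in the introduction.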

The proof of Theorem 2 starts with a lemma which shows that if our
utility function belongs to $F$, then the optimal return function $f$ belongs
to $F$. It is just this step that is required in a multiperiod model, but not
in a two period model. As we mentioned in the introduction, Schechtman
[19] has independently established this lemma for the finite horizon case
using a quite different method of proof.

Lemma 1. If $u \in F$, then $f \in F$.

Proof. We have assumed that we are only considering problems such
that (5) holds. We set $v$ equal to the 0 function, which belongs to $F$, and
we need to show that if $g = A^n v \in F$, then $Ag = A^{n+1} v \in F$. This is
sufficient since (6) can be used to easily show that the limiting function,
$f$, belongs to $F$.



It is known (for a proof in this particular case, see Miller [12]) that if $g$
is concave then $Ag$ is concave. We need to show that if $g(x, j) \in F$ (and
hence is concave) then the concave function $Ag$ satisfies

$$Ag(x_1, j) - \sum_{i=1}^{n} \lambda_i Ag(x_1 + \Delta_i, j) - Ag(x_2, j) + \sum_{i=1}^{n} \lambda_i Ag(x_2 + \Delta_i, j) \ge 0. \tag{11}$$

Let $c_1^i$, $i = 1, \ldots, n$, be the optimal decision (with respect to $g$) for the
states $x_1 + \Delta_i$, and $c_2$ and $c_1$ be the optimal decisions for the states $x_2$
and $x_1$ respectively. From (4) we know that $c_2 \ge c_1$ and $x_2 - c_2 \ge x_1 - c_1$.
Let $c_2^i$, $i = 1, \ldots, n$, be the decisions associated with the state $x_2 + \Delta_i$,
and be given by $c_2^i = c_2 - c_1 + c_1^i$. They are feasible ($0 \le c_2^i \le x_2 + \Delta_i + D_j$)
since the $c_1^i$ are feasible, $c_2 \ge c_1$, and $x_2 - x_1 \ge c_2 - c_1$.

We have that

$$Ag(x_1 + \Delta_i, j) = u(c_1^i) + \alpha E g(x_1 + \Delta_i - c_1^i + Y_j, j + 1),$$

and similar equations hold for $x_1$ and $x_2$. Since the $c_2^i$ may not be optimal

$$Ag(x_2 + \Delta_i, j) \ge u(c_2^i) + \alpha E g(x_2 + \Delta_i - c_2^i + Y_j, j + 1).$$

Let $\Delta_i' = c_1^i - c_1 = c_2^i - c_2$. By (7)

$$u(c_1) - u(c_1^i) - u(c_2) + u(c_2^i) \ge -\Delta_i' (u'(c_1) - u'(c_2)),$$

since $c_2 \ge c_1$ and $u \in F$. By the development after Eq. (9), the function
$v(x) = \alpha E g(x + Y_j, j + 1)$ belongs to $F$. Therefore by (7)

$$v(x_1 - c_1) - v(x_1 + \Delta_i - c_1^i) - v(x_2 - c_2) + v(x_2 + \Delta_i - c_2^i)
\ge -(\Delta_i - \Delta_i') [v'(x_1 - c_1) - v'(x_2 - c_2)],$$

since $v \in F$ and $x_2 - c_2 \ge x_1 - c_1$, and $(x_1 + \Delta_i - c_1^i) - (x_1 - c_1) =
(x_2 + \Delta_i - c_2^i) - (x_2 - c_2) = \Delta_i - \Delta_i'$. Combining the above equalities
and inequalities we have that the left-hand side of (11) is greater than or
equal to

$$-\sum \lambda_i [\Delta_i' (u'(c_1) - u'(c_2)) + (\Delta_i - \Delta_i')(v'(x_1 - c_1) - v'(x_2 - c_2))]. \tag{12}$$

If both $c_1$ and $c_2$ are interior points of their respective constraint sets,
$0 < c_1 < x_1 + D_j$ and $0 < c_2 < x_2 + D_j$, then $u'(c_1) = v'(x_1 - c_1)$ and
$u'(c_2) = v'(x_2 - c_2)$, by the optimality of $c_1$ and $c_2$ and the fact that the
derivatives of $u$ and $v$ exist everywhere in the interior. In this case (12)
equals $-\sum \lambda_i \Delta_i [u'(c_1) - u'(c_2)] = 0$.



We will consider the boundary cases of $c_1 = 0$ or $c_1 = x_1 + D_j$ and
$c_2 = 0$ or $c_2 = x_2 + D_j$ by giving the proofs for the cases $c_1 = 0$ and
$c_2 = 0$ only. A similar situation arises in the proof of Theorem 2, and there
we give the proofs of the cases $c_1 = x_1 + D_j$ and $c_2 = x_2 + D_j$ only.

One possibility at the boundary is $c_1$ at a boundary, say $c_1 = 0$, but
$c_2$ is not. In order to apply (12) where $c_1$ is a left end point we need to
verify that

$$u(x) - u(x + \Delta_i') \ge -\int_x^{x + \Delta_i'} \Big[u'_+(x) + \int_x^y u''_+(z)\,dz\Big] dy \tag{13}$$

for $x = c_1$. Since $u$ is nondecreasing, $u(c_1) \le \lim_{x \downarrow c_1} u(x)$. If $u(c_1) <
\lim_{x \downarrow c_1} u(x)$, then $u'_+(c_1) = +\infty$ and (13) holds. If $u(c_1) = \lim_{x \downarrow c_1} u(x)$,
then (13) holds for $x = c_1$ since $u'_+(c_1) \ge \lim_{x \downarrow c_1} u'_+(x) = \lim_{x \downarrow c_1} u'(x)$
[14, Theorem 24.1], and (13) holds with equality for all $x > c_1$. Returning
to the main argument, the optimality of $c_1$ implies that $u'_+(c_1) \le v'(x_1 - c_1)$
and clearly $\Delta_i' \ge 0$ for all $i$. Therefore (12) is greater than or equal to

$$-\sum \lambda_i [\Delta_i' v'(x_1 - c_1) + (\Delta_i - \Delta_i') v'(x_1 - c_1)] - \sum \lambda_i \Delta_i (-u'(c_2)) = 0.$$


The other possibility at the boundary is $c_2$ at a boundary, say $c_2 = 0$.
By (4) $c_1 = 0$, and therefore $c_1^i = c_2^i$. Then (11) becomes

$$v(x_1) - \sum \lambda_i v(x_1 + \Delta_i - c_1^i) - v(x_2) + \sum \lambda_i v(x_2 + \Delta_i - c_2^i).$$

This quantity is nonnegative since $v(x_1) - \sum \lambda_i v(x_1 + \Delta_i) - v(x_2) +
\sum \lambda_i v(x_2 + \Delta_i) \ge 0$ by (6) since $v \in F$, and $v(x_1 + \Delta_i) - v(x_1 + \Delta_i - c_1^i) -
v(x_2 + \Delta_i) + v(x_2 + \Delta_i - c_2^i) \ge 0$ for all $i$ since $v$ is concave.

Theorem 2. Let $u \in F$ and $X_1, X_2, \ldots$ be a sequence of random variables
describing labor income (Case a), and $Y_1, Y_2, \ldots$ be a second sequence of
random variables describing labor income (Case b). If for each $i$, $Y_i$ is
riskier than $X_i$, then the optimal amount to consume as a function of the
state $(x, j)$ in Case a is greater than the optimal amount to consume in
Case b.

Verifying the hypothesis of the following lemma leads directly to a
proof of Theorem 2.

Lemma 2. Let $f_X$ be the optimal return function in Case a and $f_Y$ be the
optimal return function in Case b. If $d(x, j) = f_Y(x, j) - f_X(x, j)$ is a
nondecreasing function of $x$ then the conclusion of Theorem 2 holds.



Proof. By Lemma 1 we know that $f_X, f_Y \in F$. Let $c^*$ be the optimal
decision for state $(x, j)$ with the optimal return function $f_X$. For $c > c^*$,

$$h((x, j), c^*, f_Y) - h((x, j), c, f_Y)$$

$$= u(c^*) + \alpha E f_X(r(x - c^*) + Y_j, j + 1) + \alpha E d(r(x - c^*) + Y_j, j + 1)
- u(c) - \alpha E f_X(r(x - c) + Y_j, j + 1) - \alpha E d(r(x - c) + Y_j, j + 1).$$

Since $d$ is nondecreasing and $c > c^*$ we have that the right-hand side is

$$\ge u(c^*) + \alpha E f_X(r(x - c^*) + Y_j, j + 1) - u(c) - \alpha E f_X(r(x - c) + Y_j, j + 1)$$

$$\ge u(c^*) + \alpha E f_X(r(x - c^*) + X_j, j + 1) - u(c) - \alpha E f_X(r(x - c) + X_j, j + 1)$$

by (10) since $r(x - c^*) > r(x - c)$ and $f_X \in F$

$$= h((x, j), c^*, f_X) - h((x, j), c, f_X) \ge 0, \quad \text{since } c^* \text{ is optimal.}$$

Therefore the optimal amount to consume in Case b is less than or
equal to $c^*$. Recall that in case of ties (which could not happen with
strict concavity) we pick the smallest $c$. Q.E.D.

The following lemma from [12] is needed to establish that $d$ is
nondecreasing. Like $f$, $h$ and $A$ vary with Case a and Case b, but we suppress
this dependence in the notation.

Lemma 3. Consider the model in the case where $Y_1, Y_2, \ldots$ are the
random variables describing labor income (Case b). Let $v \in V$ and suppose
that $v$ satisfies Condition A below. Then $f_Y(x, j) - v(x, j)$ is a nondecreasing
function of $x$.

Condition A. Given any two states $(x_1, j)$ and $(x_2, j)$, $x_2 > x_1$, and
decision $c_1$ for $(x_1, j)$, there is a feasible decision $c_2$ for $(x_2, j)$ such that

(a) $x_2 - c_2 \ge x_1 - c_1$,

(b) $v(x_1, j) - h((x_1, j), c_1, v) - v(x_2, j) + h((x_2, j), c_2, v) \ge 0$.

Equation (b) by itself is a necessary and sufficient condition that
$Av - v$ be a nondecreasing function. The proof of the lemma consists of
verifying an induction hypothesis in order to show that $A^n v - v$ is
nondecreasing, and using (5), $f_Y = \lim_{n \to \infty} A^n v$.



Proof of Theorem 2. By Lemma 2 and Lemma 3 we need to show that
Condition A holds where we let $v = f_X$. Recall that below Eq. (5) we
showed that $f$ belongs to $V$. Let $c_1^*$ and $c_2^*$ be the optimal decisions for
states $(x_1, j)$ and $(x_2, j)$ with the optimal return function $f_X$. Given a
$c_1$ we set $c_2 = c_1 + c_2^* - c_1^*$.

Since $c_1^*$ and $c_2^*$ satisfy (4), $c_2^* \ge c_1^*$, and $x_2 - x_1 \ge c_2^* - c_1^*$.
Thus $c_2$ is feasible since $c_1$ is feasible, and $c_2 - c_1 = c_2^* - c_1^* \le x_2 - x_1$
and (a) of Condition A holds.

Recalling that $f_X = Af_X$, the left-hand side of (b) in Lemma 3 equals

$$u(c_1^*) + \alpha E f_X(x_1 - c_1^* + X_j, j + 1) - u(c_1) - \alpha E f_X(x_1 - c_1 + Y_j, j + 1)
- u(c_2^*) - \alpha E f_X(x_2 - c_2^* + X_j, j + 1) + u(c_2) + \alpha E f_X(x_2 - c_2 + Y_j, j + 1)$$

$$\ge u(c_1^*) + \alpha E f_X(x_1 - c_1^* + X_j, j + 1) - u(c_1) - \alpha E f_X(x_1 - c_1 + X_j, j + 1)
- u(c_2^*) - \alpha E f_X(x_2 - c_2^* + X_j, j + 1) + u(c_2) + \alpha E f_X(x_2 - c_2 + X_j, j + 1) \tag{14}$$

by (10) since $x_2 - c_2 \ge x_1 - c_1$ and $f_X \in F$.

Now let $\Delta = c_2 - c_2^* = c_1 - c_1^*$, and we will show that (14) is
nonnegative. By (7) $(u(c_1^*) - u(c_1) - u(c_2^*) + u(c_2)) \ge -\Delta (u'(c_1^*) - u'(c_2^*))$,
since $u \in F$ and $c_2^* \ge c_1^*$. As in the proof of Lemma 1 we let
$v(x) = \alpha E f_X(x + X_j, j + 1)$. Then $v \in F$ and $(v(x_1 - c_1^*) - v(x_1 - c_1) -
v(x_2 - c_2^*) + v(x_2 - c_2)) \ge \Delta (v'(x_1 - c_1^*) - v'(x_2 - c_2^*))$. Therefore (14)
is greater than or equal to

$$\Delta (-u'(c_1^*) + v'(x_1 - c_1^*) + u'(c_2^*) - v'(x_2 - c_2^*)). \tag{15}$$

If both $c_1^*$ and $c_2^*$ are interior points of their constraint sets, then
$u'(c_1^*) = v'(x_1 - c_1^*)$, $u'(c_2^*) = v'(x_2 - c_2^*)$ and (15) equals zero. If
$c_1^*$ is at a boundary, say $c_1^* = x_1 + D_j$, and $c_2^*$ is not, then $u'(c_1^*) \ge
v'_+(x_1 - c_1^*)$ and $u'(c_2^*) = v'(x_2 - c_2^*)$. Since $\Delta$ must be nonpositive in
this case, (15) will be nonnegative. Here we use (13) as applied to $v$. If
$c_2^*$ is at a boundary, say $c_2^* = x_2 + D_j$, then $c_1^*$ must equal $x_1 + D_j$,
and $v(x_2 - c_2^*) = v(x_1 - c_1^*)$ and $v(x_2 - c_2) = v(x_1 - c_1)$. Then (14)
becomes $(u(c_1^*) - u(c_1) - u(c_2^*) + u(c_2))$ which is nonnegative by the
concavity of $u$. Q.E.D.

It is also interesting to verify the plausible result that a less risky income
stream is more desirable, that $f_X \ge f_Y$. Let $A_X$ and $A_Y$ refer to Case a
and Case b respectively. If $v$ is a function which satisfies $A_Y v \le v$ then by
monotonicity $A_Y^2 v \le A_Y v$, and hence $A_Y^n v \le v$ which shows that
$f_Y = \lim_{n \to \infty} A_Y^n v \le v$. We let $v = f_X$ and the argument is complete
when we show that $A_Y f_X \le f_X = A_X f_X$. For any feasible decision $c$,
$u(c) + \alpha E f_X(x - c + Y_j, j + 1) \le u(c) + \alpha E f_X(x - c + X_j, j + 1)$ since
$f_X$ is concave and $Y_j$ is more risky than $X_j$. This inequality is in fact an
equivalent definition of riskiness [15]. Maximizing over $c$ gives the desired
result.
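The last step, comparing the two operators on a fixed concave function, can be illustrated numerically. The sketch below is hypothetical throughout: it takes $u(c) = \sqrt{c}$, a stand-in concave terminal reward $v(w) = 2\sqrt{w}$ in place of the optimal return function of Case a, no borrowing, a sure income for Case a and a mean-preserving spread of it for Case b, and it maximizes over a finite grid of decisions.

```python
import math

def apply_A(v, x, incomes, probs, alpha=0.95, r=1.05, n=1000):
    """(Av)(x) = max over a grid of c in [0, x] of
    sqrt(c) + alpha * E v(r * (x - c) + Y)."""
    best = -math.inf
    for k in range(n + 1):
        c = x * k / n
        w = max(r * (x - c), 0.0)          # guard against rounding below zero
        ev = sum(p * v(w + y) for y, p in zip(incomes, probs))
        best = max(best, math.sqrt(c) + alpha * ev)
    return best

def v(w):
    """An assumed concave, increasing terminal reward."""
    return 2.0 * math.sqrt(w)

case_a = ([2.0], [1.0])                    # sure income
case_b = ([0.0, 4.0], [0.5, 0.5])          # same mean, riskier

results = []
for x in (0.5, 1.0, 5.0, 20.0):
    a_x = apply_A(v, x, *case_a)
    a_y = apply_A(v, x, *case_b)
    results.append((x, a_x, a_y))
    print(x, a_x, a_y)
```

Because the expected value of a concave function is lower under the riskier income for every decision $c$ on the grid, the maximum over $c$ is lower as well, which is the inequality used in the argument above.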


References 

1. D. Blackwell, Discounted dynamic programming, Ann. Math. Statist. 36 (1965),
226-235.

2. K. Chung, “A Course in Probability Theory,” Harcourt, Brace and World,
New York, 1968.

3. E. Denardo, Contraction mappings in the theory underlying dynamic programming,
SIAM Rev. 9 (1967), 165-177.

4. P. Diamond and J. Stiglitz, Increases in risk and in risk aversion, J. Econ. Theory 8
(1974), 337-360.

5. J. Drèze and F. Modigliani, Consumption decisions under uncertainty, J. Econ.
Theory 5 (1972), 308-335.

6. A. Duchan, A clarification and a new proof of the certainty equivalence theorem,
Internat. Econ. Rev. 15 (1974), 216-224.

7. M. Harrison, Discrete dynamic programming with unbounded rewards, Ann.
Math. Statist. 43 (1972), 636-644.

8. H. Leland, Saving and uncertainty: The precautionary demand for saving,
Quarterly J. Econ. 82 (1968), 465-473.

9. S. Lippman, Semi-Markov decision processes with unbounded rewards, Management
Science 19 (1973), 717-731.

10. S. Lippman, On dynamic programming with unbounded rewards, Western
Management Science Institute, University of California, Los Angeles, Working Paper
No. 212, November 1973, to appear in Management Science.

11. R. Merton, Optimal consumption and portfolio rules in a continuous time model,
J. Econ. Theory 3 (1971), 373-413.

12. B. Miller, Optimal consumption with a stochastic income stream, Econometrica 42
(1974), 253-266.

13. L. Mirman, Uncertainty and optimal consumption decisions, Econometrica 39
(1971), 179-185.

14. R. T. Rockafellar, “Convex Analysis,” Princeton Univ. Press, Princeton, N. J.,
1970.

15. M. Rothschild and J. Stiglitz, Increasing risk: I, A definition, J. Econ. Theory 2
(1970), 225-243.

16. M. Rothschild and J. Stiglitz, Increasing risk: II, Its economic consequences,
J. Econ. Theory 3 (1971), 66-84.

17. M. Rothschild and J. Stiglitz, Addendum to “Increasing risk: I, A definition,”
J. Econ. Theory 5 (1972), 306.

18. A. Sandmo, The effect of uncertainty on saving decisions, Rev. Econ. Studies 37
(1970), 353-360.

19. J. Schechtman, Some applications of competitive prices to dynamic programming
problems under uncertainty, ORC 73-5, Operations Research Center, University
of California, Berkeley, March 1973, to appear in J. Econ. Theory.

20. D. Sibley, Permanent and transitory income effects in a model of optimal
consumption and wage income uncertainty, J. Econ. Theory 11 (1975), 68-82.

21. H. Simon, Dynamic programming under uncertainty with a quadratic criterion
function, Econometrica 24 (1956), 74-81.

22. H. Theil, A note on certainty equivalence in dynamic programming, Econometrica
25 (1957), 346-349.

Received: March 13, 1975; revised: February 23, 1976 


Bruce L. Miller* 
Engineering Systems Department 
University of California 
Los Angeles, California 90024 


* I thank Nils Hakansson, Steve Lippman, and the referee for contributing several 
suggestions. 






JOURNAL OF ECONOMIC THEORY


Volume 13, Number 2, October 1976 




Published bimonthly at 37 Tempelhof, Bruges, Belgium 
by Academic Press, Inc., 111 Fifth Avenue, New York, N.Y. 10003
1976: Volumes 12-13. Price per volume: $39.50 U.S.A.; 

$42.50 outside U.S.A. (plus postage). 

1977: Volumes 14-16. Price per volume: $44.00 U.S.A.; 

$47.00 outside U.S.A. (plus postage). 

Information concerning personal subscription rates may be obtained 
by writing to the Publisher. 

All correspondence and subscription orders should be addressed to the office of the 
Publishers at 111 Fifth Avenue, New York, N.Y. 10003. 

Send notices of change of address to the office of the Publishers at least 
6-8 weeks in advance. Please include both old and new addresses. 

Second class postage paid at Jamaica, N.Y. 

Air freight and mailing in the U.S.A. by Publications Expediting, Inc.,

200 Meacham Avenue, Elmont, New York 11003. 


Printed in Bruges, Belgium, by the St. Catherine Press, Ltd. 





JOURNAL OF ECONOMIC THEORY 13, 169-183 (1976)


Sequences of Temporary Equilibria, Stationary 
Point Expectations, and Pareto Efficiency* 

Lars E. O. Svensson 

Institute for International Economic Studies, Fack, S-104 05 Stockholm, Sweden 
Received June 17, 1975; revised March 1, 1976 


Introduction 

1. Suppose we have a two-date exchange economy, where con¬ 
sumers know their preferences and endowments at both dates with 
certainty. Suppose that at the first date there exists a complete set of spot 
and forward markets. The forward contracts for deliveries at the second 
date are such that payments take place at the first date. There are no 
possibilities to trade and conclude new contracts at the second date. 
This is then a special case of the familiar Arrow-Debreu market 
structure [3].

Under some conditions on the endowments and preferences, there will 
exist a market equilibrium at the first date. Deliveries contracted on 
the spot markets are then carried out, and the economy moves to the second 
date, where deliveries contracted on the forward markets at date 1 are 
carried out. The resulting allocation is Pareto efficient, i.e., there is no 
other feasible allocation such that at least one consumer is better off 
and no consumer is worse off. 

Suppose, however, that there exists a spot market at date 2 as well,
and that consumers at date 1 know that they also can conclude contracts
at date 2, i.e., we have an overlapping sequential market structure, overlapping
in the sense that deliveries at date 2 can be contracted at both dates
1 and 2. Does the existence of a market at date 2 matter? Will the allocation
be different compared to the Arrow-Debreu case? More precisely,
will the allocation be Pareto efficient? This paper attempts to provide
some answers to these questions.

* I have benefited from comments by Peter Diamond, Stanley Fischer, Frank Fisher,
Jerry Green, Karl Jungenfelt, Robert Solow, Hal Varian, and an anonymous referee.
I would like to acknowledge the help that I received from participants in a joint M.I.T.-
Harvard seminar and in my seminar at the Institute for International Economic Studies.
Remaining errors and obscurities are, of course, my own. Financial support from the
Royal Academy of Sciences, the University of Stockholm, the Stockholm School of
Economics, the Siamon Foundation, and the Fulbright Commission is gratefully
acknowledged.

2. At the first date, given prices on the spot and forward markets, 
consumers are assumed to calculate trade plans for trade on the markets 
at date 2. These plans will be influenced by their expectations of the 
uncertain trading possibilities at the second date. A temporary equilibrium 
at the first date will be a price vector and a collection of trade plans such 
that the spot and forward markets are cleared. Spot deliveries are carried 
out, and the economy moves to the second date. 

At the second date, deliveries corresponding to forward contracts at 
date 1 are carried out. The consumers then dispose of new endowments 
that are the sum of their initial endowments at date 2 and these precon¬ 
tracted deliveries. Given prices on date 2 spot markets, they calculate 
trading plans for date 2. A temporary equilibrium at date 2 is then a price 
vector and a set of trade plans such that spot markets at that date are 
equilibrated. Deliveries according to these date 2 contracts are carried 
out, resulting in a final allocation in the economy. 

First, it is not clear whether a temporary equilibrium will exist at the 
first date, and it is even less clear that there will exist a temporary equilibrium
at the second date, even if the corresponding Arrow-Debreu
case has an equilibrium. With no restrictions on short sales, all consumers 
may not end up with nonnegative endowments at date 2, and bankruptcies 
may occur. Second, if temporary equilibria exist, the properties of the 
resulting final allocations are not obvious; it is especially unclear whether 
or not the allocations will be Pareto efficient. 

The heart of the matter is how expectations at date 1 of trading possi¬ 
bilities at date 2 are formed, and how they influence trading plans at date 1. 
Even when preferences and initial endowments are known, the existence 
of a market at date 2 introduces a crucial uncertainty at date 1 about 
trading possibilities at date 2, and how consumers react to this uncertainty 
is decisive. If consumers at date 1 believe that the economy will reach a 
temporary equilibrium at date 2, they believe that they can complete their 
desired transactions at date 2, and the relevant uncertainty is only about 
what equilibrium prices will rule at date 2. 

We should emphasize that this uncertainty about the prices in the future 
is qualitatively very different from environmental uncertainty, that is, 
uncertainty about preferences and endowments. This is because prices 
at date 2 are dependent upon consumers' actions, while with environmental
uncertainty the state of the world that actually occurs cannot be influenced 
by any consumer’s actions. The difference between these two kinds of 
uncertainty has been stressed by Radner [13] and Kurz [11]. 





To concentrate on price uncertainty, we prefer to use a model with no 
environmental uncertainty. However, the introduction of such uncertainty, 
together with complete contingency markets as in Debreu [3, Chap. 7],
would not affect our main results. 

It might be asked what concept of efficiency is the proper one in this 
context. For reasons given below, we will restrict ourselves to the Pareto 
efficiency of the ex post allocation resulting from a temporary equilibrium 
at the first date and a consecutive temporary equilibrium at the second 
date. 

If consumers' expectations at date 1 take the form of probability distributions
of the price at date 2, plans at date 1 would be functions of these
distributions. A risk-averse consumer would at date 1 most probably 
plan to trade both on forward markets at date 1, and on spot markets 
at date 2, contingent upon what prices actually occur at date 2. There seems 
to be no reason whatsoever to believe that the contingent plans at date 1 
for date 2 would show some consistency, and the resulting final allocation 
would normally not be efficient. If observed forward prices at date 1 
differed significantly from the subjective distributions for prices at date 2, 
all sorts of speculative positions might be taken, and this would certainly 
destroy the efficiency of the allocation. It thus seems rather obvious that 
except in very special cases, expectations in the form of probability distri¬ 
butions would not lead to Pareto efficiency. 

Suppose instead that consumers have point expectations, that is, that 
they expect specific prices to rule at date 2 with complete certainty. We 
then realize that for a temporary equilibrium to exist at the first date, 
point expectations must be unanimous and such that the expected spot 
prices at date 2 are exactly proportional to the forward prices at date 1. 
If some consumer had point expectations that differed from the observed 
forward prices, he would think he had the possibility of sure arbitrage, 
and with no restriction on trades he would wish to transact an infinite 
amount on the forward markets, and this cannot be consistent with a 
temporary equilibrium at the first date. However, if a temporary equi¬ 
librium with different point expectations—because of arbitrary restrictions 
on traded amounts—somehow existed at date 1, it is clear that at date 2 
some consumer’s expectations would not be fulfilled, and this would 
most probably lead to a final allocation that is not Pareto efficient. 
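
The arbitrage argument can be made concrete with two goods. The display below is only a sketch under assumed numbers that do not come from the paper: forward prices at date 1 are $(1, 1)$, while some consumer expects date 2 spot prices $p^{2,e} = (1, 2)$ with certainty (the symbol $p^{2,e}$ is introduced here for illustration). Selling $t$ units of good 1 forward and buying $t$ units of good 2 forward costs nothing at date 1, yet is expected to be worth $t$ at date 2, so the desired position grows without bound.

```latex
\[
p_2^1 = (1, 1), \qquad p^{2,e} = (1, 2), \qquad z(t) = t\,(-1, 1), \quad t > 0,
\]
\[
p_2^1 \cdot z(t) = -t + t = 0, \qquad
p^{2,e} \cdot z(t) = -t + 2t = t \longrightarrow \infty \quad \text{as } t \to \infty .
\]
% If instead p^{2,e} = \lambda p_2^1, every position with zero value at forward
% prices also has zero expected value at date 2, and the arbitrage disappears.
```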

3. For these reasons, we will assume that all consumers expect 
the spot prices at date 2 to be proportional to the forward prices at date 1. 
We call such expectations stationary point expectations. If there is any 
kind of expectation that would lead to Pareto efficiency, this ought to be it. 
The assumption rules out all (conscious) speculation, and it might
intuitively be thought that in a world with no uncertainty about preferences
and endowments, nobody would expect anything dramatic to happen, 
and consumers would stick to this somewhat conservative way of forming 
expectations. 

We note that in the literature stationary expectations often refer to 
the case when future spot prices are expected to equal present spot prices. 
In this model with both present forward trade and future spot trade, it is 
clearly more appropriate to use the concept of stationarity in our way. 

The question we want to answer then is whether stationary point 
expectations will lead to Pareto efficient allocations, i.e., given a temporary 
equilibrium with those expectations at date 1, will the temporary equi¬ 
librium at date 2 be such that we get efficiency? 

It is easy to show that a sufficient condition for Pareto efficiency is 
that the actual equilibrium prices at date 2 are proportional to the forward 
prices at date 1, and hence that expectations are being fulfilled. A test 
for Pareto efficiency is then whether sequences of temporary equilibria 
are such that expectations are fulfilled. 

4. Existence of temporary equilibria has been proved under 
various assumptions by Stigum [17], Radner [14], and Green [6, 7].
Stigum [18] has studied the efficiency of a sequence of temporary equilibria. 
Hart [9] has studied existence and optimality when markets are incomplete. 
Radner [14] and Jordan [10] have dealt with the existence of fulfilled 
expectations. 

Only Radner and Green, though, have dealt with an overlapping 
sequential market structure. The question whether stationary point 
expectations in that case lead to fulfilled expectations and hence efficiency 
has, to my knowledge, not yet been dealt with thoroughly,
and the main results in this paper are believed to be new. The few
superficial remarks in the literature I have found about this case [15,
p. 233; 12, p. 333; 19, p. 287] all suggest positive answers to the question,
a conclusion to which I think the intuition of most economists would
lead.


5. The results of this paper, however, turn out to be strongly 
negative. There do indeed exist sequences of temporary equilibria such 
that expectations are fulfilled and hence the allocation is Pareto efficient. 
We can actually show that for each Arrow-Debreu equilibrium there exists 
a sequence of temporary equilibria such that expectations are fulfilled 
and the allocations are the same, and conversely that for each sequence 
of temporary equilibria with expectations fulfilled there exists an Arrow- 
Debreu equilibrium with the same allocation. 





But, independent of the number of Arrow-Debreu equilibria, there will 
normally exist an infinite number of other temporary equilibria at date 1 
such that either the temporary equilibrium at date 2 has prices not pro¬ 
portional to forward prices at date 1, or there does not exist a temporary 
equilibrium at date 2 at all. There is no reason why at the first date a 
temporary equilibrium should be picked so that expectations are fulfilled 
at the second date, and normally expectations would be violated. Except 
in very special cases, violated expectations imply that the resulting allo¬ 
cation is not Pareto efficient. 

The main reason for these results is that with stationary point expec¬ 
tations, consumers are indifferent to the physical composition of the com¬ 
modity bundles they plan to trade on forward markets at date 1 and spot 
markets at date 2, if only the values and the sum of the bundles are 
unchanged. There is nothing then to guarantee that plans for date 2 
are consistent, and any inconsistency will not show up until date 2, when 
the trade at date 1 cannot be undone. 

The model used is a two-date exchange economy, but it is clear that 
the results carry over to a model involving production and several 
dates. 

The model and its results together with an example are presented in 
Sections 6 through 12. Further conclusions and interpretations of the 
results are discussed in Sections 13 through 19. 


The Model and its Results 

6. The economy has Q goods, two dates, and I consumers. Each
consumer i ($i = 1, \ldots, I$) is characterized by $(\bar{x}_i, u_i)$.
$\bar{x}_i = (\bar{x}_{i1}, \bar{x}_{i2}) \in R_+^Q \times R_+^Q$ is the endowment of consumer i at date 1 and date 2. It is
assumed that $\bar{x}_i \gg 0$. $u_i : R_+^Q \times R_+^Q \to R$ is the utility function
of consumer i, and $u_i(x_i)$ is the utility derived from consumption
$x_i = (x_{i1}, x_{i2}) \in R_+^Q \times R_+^Q$, where $x_{i1}$ is consumption at date 1 and
$x_{i2}$ is consumption at date 2. We assume that $u_i$ is strictly quasi-concave
and strictly increasing in each good.

7. In the familiar Arrow-Debreu market structure, there exist
2Q markets at date 1: Q spot markets and Q forward markets for deliveries
at date 2. Let $p = (p_1, p_2) \in R_+^Q \times R_+^Q$ be the Arrow-Debreu prices,
$p_1$ being the spot prices and $p_2$ the forward prices. Also, let
$z_i = (z_{i1}, z_{i2}) \in R^Q \times R^Q$ denote (net) trade by consumer i, where $z_{i1}$ is trade on the
spot markets and $z_{i2}$ is trade on the forward markets.





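At date 1, given a price vector $p$, each consumer i faces a decision
problem

\[
\begin{aligned}
\max\ & u_i(x_i), \\
\text{s.t.}\ & x_{i1} = \bar{x}_{i1} + z_{i1}, \\
& x_{i2} = \bar{x}_{i2} + z_{i2}, \\
& p_1 z_{i1} + p_2 z_{i2} \leq 0.
\end{aligned}
\tag{1}
\]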


Definition. An Arrow-Debreu equilibrium $(p, (z_i^*))$ is a price vector $p$
and a trade allocation $(z_i^*)$ such that:

(i) for each $i = 1, \ldots, I$, $z_i^*$ maximizes $u_i(x_i)$ on the set of all
$(x_i, z_i) \in R_+^{2Q} \times R^{2Q}$ that fulfill the constraints (1);

(ii) $\sum_i z_{i1}^* = 0$, i.e., spot markets clear,
$\sum_i z_{i2}^* = 0$, i.e., forward markets clear.

The allocation $(x_i^*) = (\bar{x}_i + z_i^*)$ is called the Arrow-Debreu allocation.
As is well known, an Arrow-Debreu allocation is Pareto efficient. 

Our assumptions about the endowments and the utility functions ensure 
the existence of at least one Arrow-Debreu equilibrium. (It can be shown 
that the conditions for [1, Theorem 3, p. 33] are fulfilled.) 


8. We will now study an overlapping sequential market structure. 
At date 1 there still exist 2Q markets: Q spot markets and Q forward
markets. However, contrary to the Arrow-Debreu case, there exist Q 
spot markets at date 2 as well. 

Let $z_i^1 = (z_{i1}^1, z_{i2}^1) \in R^Q \times R^Q$ denote (net) trade at date 1 by consumer
i, where $z_{i1}^1$ is trade on the spot markets and $z_{i2}^1$ is trade on the forward
markets. Let $p^1 = (p_1^1, p_2^1) \in R_+^Q \times R_+^Q$ denote prices at date 1 on the
spot and forward markets. Similarly, we let $z_i^2 \in R^Q$ denote trade at date 2
by consumer i and $p^2 \in R_+^Q$ denote prices at date 2.

At date 1 consumers have expectations of what prices $p^2$ will rule on the
markets at date 2. We assume that all consumers expect prices at date 2
to be proportional to forward prices at date 1 with certainty, i.e., they
expect, with probability one, that

\[ p^2 = \lambda p_2^1 \qquad (\lambda > 0). \]

This kind of expectations we call stationary point expectations. 

At date 1, given prices $p^1$, each consumer i calculates a consumption and
trading plan by solving the following decision problem, thus taking into
consideration the possibilities of trade at date 2 and having stationary
point expectations.

\[
\begin{aligned}
\max\ & u_i(x_{i1}, x_{i2}), \\
\text{s.t.}\ & x_{i1} = \bar{x}_{i1} + z_{i1}^1, \\
& x_{i2} = \bar{x}_{i2} + z_{i2}^1 + z_i^2, \\
& p_1^1 z_{i1}^1 + p_2^1 z_{i2}^1 \leq 0, \\
& (\lambda p_2^1) z_i^2 \leq 0.
\end{aligned}
\tag{2}
\]

We note that the plans will be independent of $\lambda$, so we might as well assume
that there is one unique $\lambda$ for all consumers. Also we will say that expectations
are fulfilled whenever the actual prices at date 2 are proportional
to forward prices at date 1, independent of the value of the proportionality
constant.

Definition. A consumption plan and a trade plan at date 1 for consumer
i, given prices $p^1$, are vectors $(x_{i1}^*, x_{i2}^*)$ and $(z_{i1}^{1*}, z_{i2}^{1*}, z_i^{2*})$ respectively,
such that they maximize $u_i(x_i)$ on the set of all
$(x_i, z_i^1, z_i^2) \in R_+^{2Q} \times R^{2Q} \times R^Q$ that fulfill the constraints (2).

At date 2, given consumption $x_{i1}$ at date 1, forward trade $z_{i2}^1$ at date 1,
and prices $p^2$, each consumer i calculates a consumption and trade plan
for date 2, by solving the following problem.

\[
\begin{aligned}
\max\ & u_i(x_{i1}, x_{i2}), \\
\text{s.t.}\ & x_{i2} = \bar{x}_{i2} + z_{i2}^1 + z_i^2, \\
& p^2 z_i^2 \leq 0.
\end{aligned}
\tag{3}
\]

Definition. A consumption plan and a trade plan at date 2 for consumer
i, given $x_{i1}$, $z_{i2}^1$ and prices $p^2$, are vectors $x_{i2}^*$ and $z_i^{2*}$ such that they
maximize $u_i(x_{i1}, x_{i2})$ on the set of all $(x_{i2}, z_i^2) \in R_+^Q \times R^Q$ that fulfill
the constraints (3).

Lemma 1. Let $(z_{i1}^{1*}, z_{i2}^{1*}, z_i^{2*})$ be a trade plan at date 1 for consumer i,
given $p^1$. Let $x_i^*$ be the corresponding consumption plan at date 1. Then for
any $z^2 \in R^Q$ such that $p_2^1 z^2 = 0$, $(z_{i1}^{1*}, z_{i2}^{1*} - z^2, z_i^{2*} + z^2)$ is a trade plan
with the same consumption plan $x_i^*$.

This lemma says that a consumer is indifferent to the physical decomposition
between forward trade at date 1 and spot trade at date 2, when
neither their sum nor their values change. The lemma is crucial for the
following results.
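
The indifference stated in Lemma 1 can be checked numerically. The following is a minimal sketch under assumed data: two goods, one consumer, and prices and trades chosen only for illustration (none of the numbers come from the paper, and the helper names are hypothetical). It moves a bundle with zero value at forward prices from forward trade to planned spot trade and records the budget values and date 2 consumption before and after.

```python
def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))

def shift_plan(forward, spot2, shift):
    """Move `shift` from forward trade at date 1 to planned spot trade at date 2."""
    new_forward = [f - s for f, s in zip(forward, shift)]
    new_spot2 = [z + s for z, s in zip(spot2, shift)]
    return new_forward, new_spot2

# Hypothetical data for one consumer and two goods (Q = 2).
p_spot = [1.0, 2.0]      # spot prices at date 1
p_fwd = [1.0, 0.5]       # forward prices at date 1
endow2 = [3.0, 3.0]      # endowment at date 2
spot1 = [1.0, -0.5]      # spot trade at date 1
forward = [-1.0, 2.0]    # forward trade at date 1
spot2 = [0.5, -1.0]      # planned spot trade at date 2
shift = [1.0, -2.0]      # chosen so that its value at forward prices is zero

def date2_consumption(fwd, sp2):
    """Date 2 consumption: endowment plus forward deliveries plus spot trade."""
    return [e + f + z for e, f, z in zip(endow2, fwd, sp2)]

new_forward, new_spot2 = shift_plan(forward, spot2, shift)

budget1_before = dot(p_spot, spot1) + dot(p_fwd, forward)
budget1_after = dot(p_spot, spot1) + dot(p_fwd, new_forward)
budget2_before = dot(p_fwd, spot2)       # date 2 budget, up to the factor lambda
budget2_after = dot(p_fwd, new_spot2)
x2_before = date2_consumption(forward, spot2)
x2_after = date2_consumption(new_forward, new_spot2)
```

Both budget values and the date 2 consumption vector are the same before and after the shift, which is why the consumer has no reason to prefer one decomposition over the other.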





Proof. $x_i^*$ and $(z_{i1}^{1*}, z_{i2}^{1*} - z^2, z_i^{2*} + z^2)$ fulfill the constraints (2).
Since

\[ x_{i2}^* = \bar{x}_{i2} + z_{i2}^{1*} + z_i^{2*} = \bar{x}_{i2} + (z_{i2}^{1*} - z^2) + (z_i^{2*} + z^2), \]

it is obvious that $x_i^*$ and $(z_{i1}^{1*}, z_{i2}^{1*} - z^2, z_i^{2*} + z^2)$ maximize $u_i(x_i)$ on the
set of all $(x_i, z_i^1, z_i^2) \in R_+^{2Q} \times R^{2Q} \times R^Q$ that fulfill the constraints (2).

Q.E.D.

Definition. A temporary equilibrium at date 1 $(p^1, (z_i^1))$ is a price $p^1$
and a trade allocation $(z_i^1)$ such that:

(i) there exist $(z_i^2)$ such that $(z_{i1}^1, z_{i2}^1, z_i^2)$ is a trade plan at date 1,
given $p^1$, for each i;

(ii) $\sum_i z_{i1}^1 = 0$, i.e., spot markets clear,
$\sum_i z_{i2}^1 = 0$, i.e., forward markets clear.

Definition. A temporary equilibrium at date 2 $(p^2, (z_i^2))$, given
$(x_{i1})$, $(z_{i2}^1)$, is a price $p^2$ and a trade allocation $(z_i^2)$ such that:

(i) $z_i^2$ is a trade plan at date 2, given $x_{i1}$, $z_{i2}^1$, $p^2$, for each i;

(ii) $\sum_i z_i^2 = 0$, i.e., spot markets clear.

Definition. A sequence of temporary equilibria $(p^1, (z_i^1), p^2, (z_i^2))$ is
a temporary equilibrium $(p^1, (z_i^1))$ at date 1, and a temporary equilibrium
$(p^2, (z_i^2))$ at date 2 where $(x_{i1})$ and $(z_{i2}^1)$ are given from the temporary
equilibrium at date 1. $(x_i) = ((x_{i1}, \bar{x}_{i2} + z_{i2}^1 + z_i^2))$ is called the
equilibrium allocation.

The next two theorems show that for each Arrow-Debreu equilibrium 
there exists a sequence of temporary equilibria such that expectations are 
fulfilled, and vice versa. 

Theorem 1. Let $(p, (z_i))$ be an Arrow-Debreu equilibrium. Then there
exists a sequence of temporary equilibria $(p^1, (z_i^1), p^2, (z_i^2))$ such that
$p^1 = p$, $p^2 = p_2$, and such that the equilibrium allocation is identical to
the Arrow-Debreu allocation $(x_i)$.

Proof. It is straightforward to show that $(p, (z_i), p_2, (0))$ is a possible
sequence of temporary equilibria that has the same allocation as the
Arrow-Debreu equilibrium. Q.E.D.

Theorem 2. Let $(p^1, (z_i^1), p^2, (z_i^2))$ be a sequence of temporary equilibria
with the equilibrium allocation $(x_i)$, such that $p^2 = \lambda p_2^1$ ($\lambda > 0$). Then there
exists an Arrow-Debreu equilibrium with prices equal to $p^1$ and with the
same allocation as the sequence of temporary equilibria.

Proof. If we let, for each i,

\[ z_i = (z_{i1}^1,\; z_{i2}^1 + z_i^2), \]

it is easy to show that $(p^1, (z_i))$ is an Arrow-Debreu equilibrium with the
same allocation as the sequence of temporary equilibria. Q.E.D.
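
The step described as easy can be sketched for the budget constraint and for market clearing. The display below is a reconstruction under stated assumptions rather than the author's own derivation: it takes the Arrow-Debreu trade of consumer i to be $z_i = (z_{i1}^1, z_{i2}^1 + z_i^2)$, writes $\bar{x}_{i2}$ for the date 2 endowment, and uses $p^2 = \lambda p_2^1$ with $\lambda > 0$. Optimality of $z_i$ in the Arrow-Debreu problem is not shown here.

```latex
% Budget constraint (1) at prices p^1 = (p_1^1, p_2^1):
\[
p_1^1 z_{i1}^1 + p_2^1 \left( z_{i2}^1 + z_i^2 \right)
  = \underbrace{p_1^1 z_{i1}^1 + p_2^1 z_{i2}^1}_{\leq 0 \text{ by } (2)}
  + \frac{1}{\lambda} \underbrace{p^2 z_i^2}_{\leq 0 \text{ by } (3)}
  \leq 0 .
\]
% Market clearing, from the two temporary equilibria:
\[
\sum_i z_{i1}^1 = 0, \qquad
\sum_i \left( z_{i2}^1 + z_i^2 \right) = \sum_i z_{i2}^1 + \sum_i z_i^2 = 0 .
\]
% Same allocation: date 2 consumption is unchanged,
\[
x_{i2} = \bar{x}_{i2} + z_{i2}^1 + z_i^2 .
\]
```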

Lemma 2. For all prices $p^1$ at date 1 such that the corresponding trade
plans fulfill

\[ \sum_i z_{i1}^1 = 0, \quad \text{i.e., spot markets clear,} \]

there exists a temporary equilibrium at date 1.

Since the requirement that only spot markets clear is very weak, we would 
expect that there normally exists an infinite number of different prices p 1 
such that spot markets clear. This lemma then implies that the number of 
temporary equilibria at date 1 with different prices is infinite, independent 
of the number of Arrow-Debreu equilibria. 
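
A small numerical illustration may help here. The sketch below is not taken from the paper: it assumes two consumers, two goods per date, and Cobb-Douglas utilities with equal weight 1/4 on each of the four dated goods, so that planned consumption has a closed form. With these assumed endowments, any pair of forward prices summing to 2 clears the date 1 spot markets, but only the pair (1, 1) also makes the plans for date 2 consistent.

```python
# Hypothetical economy: each row is one consumer's endowment of
# (date 1 good 1, date 1 good 2, date 2 good 1, date 2 good 2).
ENDOWMENTS = [
    [3.0, 1.0, 1.0, 3.0],
    [1.0, 3.0, 3.0, 1.0],
]
SHARE = 0.25  # assumed Cobb-Douglas expenditure share of every dated good

def planned_consumption(prices, endowment):
    """Cobb-Douglas plan: spend the fraction SHARE of wealth on each dated good."""
    wealth = sum(p * e for p, e in zip(prices, endowment))
    return [SHARE * wealth / p for p in prices]

def excess_demand(prices):
    """Aggregate planned consumption minus aggregate endowment, good by good."""
    plans = [planned_consumption(prices, e) for e in ENDOWMENTS]
    return [
        sum(plan[k] for plan in plans) - sum(e[k] for e in ENDOWMENTS)
        for k in range(4)
    ]

# Prices are (spot 1, spot 2, forward 1, forward 2).
arrow_debreu = excess_demand([1.0, 1.0, 1.0, 1.0])  # all four markets clear
other = excess_demand([1.0, 1.0, 1.5, 0.5])         # only the spot markets clear
```

At the second price vector the first two entries of `other` are zero while the last two are not, so date 1 consumption plans are consistent and date 2 plans are not. Varying the forward prices along the line where they sum to 2 gives a continuum of such price vectors in this example.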

Proof. Let $((z_{i1}^1, z_{i2}^1, z_i^2))$ be the trade plans at date 1, given $p^1$. Suppose
$\sum_i z_{i2}^1 \neq 0$, else we already have a temporary equilibrium at date 1. Let
$\sum_i z_{i2}^1 = z^2$ and define

\[
\begin{aligned}
z_{i1}^{1*} &= z_{i1}^1 \quad (i = 1, \ldots, I), \\
z_{12}^{1*} &= z_{12}^1 - z^2, \\
z_1^{2*} &= z_1^2 + z^2, \\
z_{i2}^{1*} &= z_{i2}^1 \quad (i = 2, \ldots, I), \\
z_i^{2*} &= z_i^2 \quad (i = 2, \ldots, I).
\end{aligned}
\]

Since $u_i$ is strictly increasing in each good, the trade plans at date 1
will fulfill the constraints (2) with equality. Then for each i, we have

\[ p_1^1 z_{i1}^1 + p_2^1 z_{i2}^1 = 0, \]

and hence

\[ p_1^1 \sum_i z_{i1}^1 + p_2^1 \sum_i z_{i2}^1 = 0. \]

Since $\sum_i z_{i1}^1 = 0$ by assumption, we get

\[ p_2^1 \sum_i z_{i2}^1 = p_2^1 z^2 = 0. \]

Hence the conditions for Lemma 1 are fulfilled, and it is clear that
$((z_{i1}^{1*}, z_{i2}^{1*}, z_i^{2*}))$ are trade plans at date 1, given $p^1$. Furthermore,

\[ \sum_i z_{i2}^{1*} = z_{12}^1 - z^2 + \sum_{i \geq 2} z_{i2}^1 = \sum_i z_{i2}^1 - z^2 = 0, \]

so $(p^1, (z_i^{1*}))$ is a temporary equilibrium at date 1. Q.E.D.
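
The construction in this proof can be sketched in a few lines. The example below uses made-up plans for two consumers and two goods with forward prices (1, 1); the plans are simply assumed to be optimal, which the code does not check, and the function names are hypothetical. It shifts the aggregate forward excess onto the first consumer, as the proof does, and shows that the imbalance then reappears in the planned spot trades for date 2.

```python
def vec_sum(vectors):
    """Componentwise sum of equally long lists."""
    return [sum(parts) for parts in zip(*vectors)]

def clear_forward_markets(forward, planned_spot):
    """Shift the aggregate forward excess onto consumer 1 (index 0)."""
    excess = vec_sum(forward)
    new_forward = [list(z) for z in forward]
    new_spot = [list(z) for z in planned_spot]
    new_forward[0] = [a - b for a, b in zip(forward[0], excess)]
    new_spot[0] = [a + b for a, b in zip(planned_spot[0], excess)]
    return new_forward, new_spot

# Hypothetical plans: each row is one consumer's trade in the two goods.
forward = [[2.0, -2.0], [-1.0, 1.0]]      # sums to (1, -1): forward markets do not clear
planned_spot = [[0.0, 0.0], [0.0, 0.0]]   # planned spot trade at date 2

new_forward, new_spot = clear_forward_markets(forward, planned_spot)
forward_total = vec_sum(new_forward)      # forward markets now clear
spot_total = vec_sum(new_spot)            # the imbalance has moved to date 2
```

Each consumer's total planned delivery for date 2 (forward plus spot) is unchanged, so consumption plans are unaffected, while the inconsistency is hidden until the date 2 markets open.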

The next theorem shows that for each price p 1 at date 1 such that plans 
for consumption at date 1 are consistent, but plans for consumption at 
date 2 are inconsistent, there does exist a temporary equilibrium at date 1, 
but that there does not exist a sequence of temporary equilibria such that 
expectations are fulfilled. Normally, there would exist an infinite number 
of such prices p 1 . 

Theorem 3. Suppose there exists a price $p^1$ at date 1 with corresponding
consumption and trade plans $((x_{i1}, x_{i2}))$, $((z_{i1}^1, z_{i2}^1, z_i^2))$, such that
$\sum_i x_{i1} = \sum_i \bar{x}_{i1}$, i.e., $\sum_i z_{i1}^1 = 0$, but that $\sum_i x_{i2} \neq \sum_i \bar{x}_{i2}$. Then either (i) there
exists a sequence of temporary equilibria such that $p^2$ is not proportional
to $p_2^1$, or (ii) there does not exist a temporary equilibrium at date 2.

Proof. From Lemma 2 it follows that there exists a temporary equilibrium
$(p^1, (z_i^{1*}))$, with possibly modified trade plans $((z_{i1}^{1*}, z_{i2}^{1*}, z_i^{2*}))$
but unchanged consumption plans $(x_i)$.

At date 2, $(x_{i1})$ and $(z_{i2}^{1*})$ are given. For prices $p^2 = \lambda p_2^1$ ($\lambda > 0$), the trade
plans at date 2 clearly are $(z_i^{2*})$. But we have

\[ \sum_i z_i^{2*} = \sum_i \left( x_{i2} - \bar{x}_{i2} - z_{i2}^{1*} \right) = \sum_i x_{i2} - \sum_i \bar{x}_{i2} \neq 0 \]

by assumption. Thus there cannot exist a temporary equilibrium at date 2
with prices $p^2 = \lambda p_2^1$.

The economy at date 2 is equivalent to a static exchange economy with
new endowments $((\bar{x}_{i2} + z_{i2}^{1*}))$ and utility functions $(u_{i2}(x_{i2})) = (u_i(x_{i1}, x_{i2}))$.
There are two possibilities:

(i) There exists a temporary equilibrium at date 2 for some price
$p^2$ not proportional to $p_2^1$, where for each consumer the value of his new
endowments $p^2(\bar{x}_{i2} + z_{i2}^{1*})$ is nonnegative.

(ii) An equilibrium may fail to exist. Because of our assumptions
on the utility functions, the only way an equilibrium may fail to exist
is if the value of some consumer’s date 2 endowments becomes negative.
Since we have not restricted forward trades to fulfill $\bar{x}_{i2} + z_{i2}^1 \geq 0$, this
might very well happen. Q.E.D.

9. If the value of some consumer’s new endowments at date 2 
becomes negative, the consumer goes bankrupt. The issue of bankruptcy 
has been dealt with by Green [7, 8], Stigum [18, 19], and Ledyard [12]. 
We could obviously have restricted trades on the forward markets at date 1 
so as to not allow short selling, thus ensuring $\bar{x}_{i2} + z_{i2}^1 \geq 0$ for each i.
This way we could have assured the existence of a sequence of temporary 
equilibria whenever a temporary equilibrium at the first date exists. It is 
obvious that this restriction would not change our results about the likeli¬ 
hood of expectations being fulfilled, although the number of possible 
temporary equilibria at date 1 would have been somewhat restricted. 

10. So far we have established an equivalence relationship 
between sequences of temporary equilibria with fulfilled expectations and
Arrow-Debreu equilibria. We have also seen that normally sequences
of temporary equilibria will not be such that expectations are fulfilled. 
Before relating these results to efficiency, we have to decide what concept 
of efficiency to use. In an Arrow-Debreu model with environmental 
uncertainty there is an important distinction, due to Starr [16], between 
ex post and ex ante efficiency. Since our model does not have any such 
uncertainty, this distinction is not directly relevant here. Given a temporary 
equilibrium at date 1 with corresponding consumption plans $(x_i)$, we
might try to apply some ex ante efficiency criterion to these. Since there 
is nothing to guarantee that these consumption plans are feasible with 
respect to endowments at date 2, this is not very meaningful, though. 
Given a specific temporary equilibrium at date 1, there will normally 
always exist another temporary equilibrium at date 1 with other consump¬ 
tion plans such that consumers would get higher utility levels if the plans 
were realized at date 2, i.e., the utility they expect to reach at date 2 
would be higher. 

The temporary equilibrium at date 2 will result in an equilibrium allo¬ 
cation which, if expectations are not fulfilled, normally will differ from the 
consumption plans at date 1. This equilibrium allocation will be Pareto 
efficient in a restricted sense, namely if date 1 consumption $(x_{i1})$ and
forward trade $(z_{i2}^1)$ are fixed. But there usually would exist another
feasible allocation of consumption at dates 1 and 2 such that some con¬ 
sumer would be better off and no consumer worse off. 

Efficiency may also be related to the information structure, and the
number of markets available, especially with environmental uncertainty
when all contingencies cannot be observed by all consumers, and when
markets do not exist for all contingencies. This has been further dealt
with by Radner [13], Diamond [4], and Hart [9]. In our case, however,
there is no environmental uncertainty (there is only one environmental
state of the world to be observed), and the allocations that can be reached 
with deliveries in the Arrow-Debreu market structure are the same as 
those that can be reached with the overlapping sequential market structure. 
Hence these distinctions are not relevant to our model. For these reasons 
the interesting efficiency concept here is the ex post Pareto efficiency of 
the equilibrium allocation of consumption at dates 1 and 2, resulting from 
a sequence of temporary equilibria. 

11. Since a sequence of temporary equilibria with fulfilled expec¬ 
tations corresponds to an Arrow-Debreu equilibrium, and since an 
Arrow-Debreu equilibrium is Pareto efficient, fulfilled expectations are 
a sufficient condition for Pareto efficiency. Also, if trade plans at date 1
$((z_i^1, z_i^2))$ are such that $\sum_i z_i^2 = 0$, this is a sufficient condition for Pareto
efficiency, if the temporary equilibrium at date 2 is unique. If it is not,
the economy might still end up in another temporary equilibrium at date 2,
giving an equilibrium allocation that may not be Pareto efficient. A third
sufficient condition for Pareto efficiency is that trade plans at date 1
are such that $z_i^2 = 0$ for all i.

Our theorems show that none of these sufficient conditions are likely 
to be fulfilled. 

12. Having established that fulfilled expectations are a sufficient 
condition for Pareto efficiency, we might wonder whether they are a 
necessary condition for this. Numerical examples can be constructed 
that show that this is not the case. The equilibrium allocation might still 
be efficient in very special cases, even if expectations are violated. But 
it seems safe to conclude that in an overwhelming number of cases, 
violated expectations lead to nonefficient allocations. 


Conclusions 

13. Hence, stationary point expectations in an overlapping market 
structure do not ensure fulfilled expectations and efficiency. Instead most 
sequences of temporary equilibria would violate expectations and lead 
to inefficient equilibrium allocations. Hence the addition of a future 
spot market in the Arrow-Debreu framework is crucial, even with no
environmental uncertainty and under assumptions that intuitively should
be the most favorable ones for acquiring efficiency. 

14. If we thought of this two-date experience of consumers as
being repeated, and if consumers would learn, a situation with point 
expectations repeatedly being violated could not be sustained. We would 
tend to conclude that consumers would adopt some other form of expec¬ 
tations, or that some institutional changes in the market structure would 
take place. 

Since it can be shown that some consumer may get windfall gains 
when expectations are violated, an overall agreement to abolish date 2 
trading seems unlikely. Fulfilled expectations and efficiency would result, 
if consumers restricted themselves to zero trade at date 2. However, given 
their stationary point expectations, they are completely indifferent in the 
sense of Lemma 1 between forward trade at date 1 and spot trade at date 2,
and there is hence no reason why they all would spontaneously choose 
to do all their trade forward. 

If date 2 prices were exogenously fixed so as to be proportional to
forward prices at date 1, disequilibria would tend to occur at date 2, 
and consumers would not be able to complete their desired transactions. 
If they then realize that transactions (but not prices) at date 2 are uncertain, 
they would do all their transactions on the forward market at date 1. 
Hence efficiency would result. But there is no reason why prices at date 2 
should be so exogenously fixed. 

15. If, given stationary point expectations, institutional changes 
in the market structure are unlikely, the other way out when expectations 
are not fulfilled is for consumers to adopt other forms of expectations. 
The obvious alternative to point expectations is expectations in the form 
of subjective probability distributions of prices at date 2. Consumers 
would then normally enter into both forward trade at date 1 and spot 
trade at date 2, and as we concluded in the introduction, there is no reason 
why efficiency generally would result. Special cases with infinite risk 
aversion might, however, lead to zero trade at the second date, prices 
at date 2 proportional to forward prices at date 1, and efficiency. Even 
limited risk aversion with special distribution expectations might lead 
to date 2 prices proportional to forward prices. But the observed outcome 
would then not be consistent with the subjective probability distributions. 

Given distribution expectations, no consumer is subjectively better 
off without date 2 markets, and some consumer might very well think 
he is worse off without these. Hence also in that case, there may not be 
any overall agreements to change the market structure. 





16. Jordan [10] has examined distribution expectations in a 
sequence of temporary equilibria using a model with production but with 
spot markets only. He gave sufficient conditions for the existence of 
fulfilled expectations in a weak sense, namely that observed spot prices 
are consistent with subjective distributions. We have shown that expectations 
in an overlapping market structure may be fulfilled in a stronger 
sense, when future spot prices are expected to be proportional to present 
forward prices, but that they normally would be violated. 

17. If we introduced transaction costs, these would reasonably 
be higher for forward trade than for spot trade, so this would rather 
increase trade at date 2 and not lead to efficiency. 

18. Our results are believed to be of interest also in a more general 
context than this specific model. Hart [9] has shown that if new markets 
are opened up in an incomplete market structure, the new allocation may 
be Pareto inferior to the old one, even when expectations are fulfilled. 
This may not be that surprising, though, since it is known from the theory 
of second best that if only some necessary conditions for efficiency are 
fulfilled, the restoration of further necessary conditions does not imply 
that consumers are better off. We have shown that the addition of future 
markets in an already complete Arrow-Debreu market structure will lead 
to efficient allocations if expectations are fulfilled, but that there is every 
reason to believe that expectations will be violated and the resulting 
allocation will not be efficient. Thus we have a case in which fewer markets 
would, in a sense, be better, contradicting the general welfare theory result 
that competitive misallocations result from having too few markets. 
Especially, it may be the case that introducing forward markets in a 
situation with spot markets operating on each date may not improve 
allocational efficiency. Of course this has to do with the fact that the future 
spot market causes a crucial uncertainty about the future prices. In a 
sense then, the new set of markets is not complete, since all contingencies 
are not covered. We could conceive of an additional set of markets to 
cope with this new uncertainty; a set of markets for puts and calls, 
where contracts are contingent upon prices, as in Friesen [5] and Kurz [11]. 
Realistically, this does not seem to be a promising solution. We are already 
sufficiently suspicious about assumptions that markets cover all environmental 
contingencies, not to speak of these price contingencies. Also, 
possibilities to conclude puts and calls contracts at several dates would in 
turn necessitate new puts and calls contracts on this first set of puts and 
calls, etc. 





19. Finally, our results can be seen as nothing but an attempt to 
work out, in a most simple context, the idea, repeatedly put forward by 
Joan Robinson, that if time is to be brought into economic models in an 
essential way, the models must incorporate the fact that “the past cannot 
be undone, and the future cannot be known.” This fact is essentially what 
we have to some extent managed to represent in our model with the 
sequential market structure, and which is not taken into account in the 
Arrow-Debreu market structure. 


References 

1. K. J. Arrow and F. H. Hahn, “General Competitive Analysis,” Holden-Day, 
San Francisco, California, 1971. 

2. M. S. Balch, D. L. McFadden, and S. Y. Wu (Eds.), “Essays on Economic 
Behaviour under Uncertainty,” North-Holland, Amsterdam; American Elsevier, 
New York, 1974. 

3. G. Debreu, “Theory of Value,” Yale Univ. Press, 1959. 

4. P. A. Diamond, The role of a stock market in a general equilibrium model with 
technological uncertainty, Amer. Econ. Rev. 57 (1967), 759-776. 

5. P. M. Friesen, A reinterpretation of the equilibrium theory of Arrow and Debreu 
in terms of financial markets, I.M.S.S.S. Technical Report No. 126, Stanford 
University, 1974. 

6. J. R. Green, Temporary equilibrium in a sequential trading model with spot and 
futures transactions, Econometrica 41 (1973), 1103-1123. 

7. J. R. Green, Pre-existing contracts and temporary equilibrium, in [2, pp. 263-286]. 

8. J. R. Green, Reply to comments, in [2, pp. 296-299]. 

9. O. D. Hart, On the optimality of equilibrium when markets are incomplete, 
J. Econ. Theory 11 (1975), 413. 

10. J. S. Jordan, Temporary competitive equilibrium and the existence of self-fulfilling 
expectations, Discussion Paper No. 301, Univ. Pennsylvania. 

11. M. Kurz, The Kesten-Stigum model and the treatment of uncertainty in equilibrium 
theory, in [2, pp. 389-399]. 

12. J. O. Ledyard, On sequences of temporary equilibrium, in [2, pp. 332-338]. 

13. R. Radner, Competitive equilibrium under uncertainty, Econometrica 36 (1968), 
31-58. 

14. R. Radner, Existence of equilibrium of plans, prices and price expectations in a 
sequence of markets, Econometrica 40 (1972), 298-303. 

15. M. Satto, Professor Debreu on “Theory of Value”: a review article, Internat. 
Econ. Rev. 2 (1961), 231-237. 

16. R. M. Starr, Optimal production and allocation under uncertainty, Quarterly 
J. Econ. 87 (1973), 81-95. 

17. B. P. Stigum, Competitive equilibria under uncertainty, Quarterly J. Economics 83 
(1969), 533-561. 

18. B. P. Stigum, Competitive resource allocation over time under uncertainty, in 
[2, pp. 301-331]. 

19. B. P. Stigum, On pre-existing contracts and temporary equilibria, in [2, pp. 286-296]. 





JOURNAL OF ECONOMIC THEORY 13, 184-192 (1976) 


An Advantage of the Bargaining Set over the Core 

Michael Maschler 

Department of Mathematics, The Hebrew University, Jerusalem, Israel 
Received June 30, 1975 


An example of a market game is described for which the bargaining set seems 
to be intuitively more acceptable than the (nonempty) core. It also yields more 
insight into the nature of the competition that may exist among the traders. 


1. Introduction 

It is customary to argue that the bargaining set has an advantage over 
the core, because the core is empty in many cases, whereas the bargaining 
set never is (see Maschler and Davis [2] and Peleg [3]). 

But what about situations where the core is not empty and the bargaining 
set contains outcomes that are not in the core? 1 

The first inclination is to claim that in this case the core is a superior 
solution concept because it represents “safer” outcomes. Indeed an 
outcome certainly seems safer when there are no objections to it than when 
there are objections that can be countered. But this argument is not quite 
convincing; participants in a game may be willing to adopt outcomes that 
are less safe, if they yield higher payoffs. Nevertheless, until recently I was 
not aware of an example of a game with a nonempty core for which there 
is a heuristic argument to the effect that, in some sense, the bargaining 
set is more convincing. In this paper I would like to describe such an 
example, inspired by the recent note “Disadvantageous Syndicates” by 
Postlewaite and Rosenthal [4]. 

Quite amusingly, [4] itself was a reaction to Aumann’s paper “Disadvantageous 
Monopolies” [1]. In his paper Aumann shows that in certain 
exchange economies with a continuum of traders, it is to the “disadvantage” 
of some players to syndicate themselves; i.e., to decide to 
act as a single trader (atom), even when syndication puts them in the role 
of a monopolist. The “disadvantage” lies in the fact that after syndication 
the core of the original game widens to include outcomes that are worse 

1 Of course, any outcome in the core must belong to the bargaining set. 



for the members of the monopolistic syndicate. In some of the examples 
only outcomes that are worse for the syndicate are added. Since this 
phenomenon is unintuitive and contradicts economic experience and 
theory, Aumann concludes that the core is not the proper solution concept 
for studying syndication. 

In [4], the authors produce a five-person exchange economy which 
exhibits a similar phenomenon. In this simple situation the authors feel 
that they can explain why the syndicated monopolist is at a disadvantage. 
They argue that the disadvantage results from perfectly acceptable 
economic considerations, and conclude that, at least in the finite case, 
the core should be “rehabilitated”. 

In this paper we argue that if one carries the analysis of [4] a little further, 
one may reach the conclusion that after all the core does not reflect the 
economic forces at work in that example and perhaps even there the 
disadvantageousness of syndication is not intuitively acceptable. Moreover, 
we shall show that the bargaining set reflects the economics of that 
situation better and may yield new insights on the nature of the competition 
between the traders. Specifically, we shall show that from the point of 
view of the bargaining set, monopolistic syndication is not disadvantageous 
in that example. 

We are dealing here with one example. Furthermore, we freely employ 
heuristic arguments—not mathematical statements. Obviously, no result 
of general validity can be deduced from such a procedure. In particular, 
we do not claim that the bargaining set is always superior to the core, nor 
do we claim that syndication is always worthwhile. We hope, however, 
to convince the reader that by studying various game theoretical solutions 
to the same conflict situation one may gain a deeper insight into the 
situation. In our example one gains insight into the phenomenon of 
competition. 


2. The Market Game 

We consider an economy consisting of five traders. Traders 1 and 2 
each initially hold one unit of commodity A, and traders 3, 4, and 5 each 
initially hold a units of commodity B. 

We assume that the commodities are completely complementary goods 
that are useful only in equal quantities. We also assume that from one 
unit of A and one unit of B one can produce an item that can be sold at 
a net profit of one unit of money (say $1.00). 

If $P = \{1, 2\}$, $Q = \{3, 4, 5\}$, $N = P \cup Q$, this market can be represented 
as a cooperative game (N; v) with side payments, whose characteristic 
function is: 

$$v(S) = \min\bigl[\,|S \cap P|,\ a\,|S \cap Q|\,\bigr], \qquad \text{all } S \subseteq N.$$

Here, $|T|$ denotes the number of traders in the set T. 
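
The characteristic function can be evaluated mechanically. The following is a minimal sketch, not part of the original note: the function name `v`, the use of exact fractions, and the sample value a = 1/2 are illustrative assumptions.

```python
from fractions import Fraction

P = frozenset({1, 2})      # traders holding commodity A
Q = frozenset({3, 4, 5})   # traders holding commodity B

def v(S, a):
    """Worth of coalition S: min(|S ∩ P|, a * |S ∩ Q|)."""
    S = frozenset(S)
    return min(Fraction(len(S & P)), a * len(S & Q))

a = Fraction(1, 2)               # illustrative value of the parameter a
print(v({1, 2, 3, 4, 5}, a))     # worth of the grand coalition
print(v({1, 3, 4}, a))           # one A-holder with two B-holders
print(v({2, 5}, a))              # one A-holder with one B-holder
```

Exact fractions are used so that comparisons such as x(S) >= v(S) in later checks are not affected by rounding.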

Such games are particular cases of the market games considered by 
Shapley and Shubik in [5]. Postlewaite and Rosenthal consider in [4] 
the case a = 1/2. 

Example. Each of two manufacturers owns two machines that can 
be operated only by skilled workers. There are exactly three available 
skilled workers, each willing to work at most 8 hr a day. When a worker 
operates the machine during 8 hr he produces an item that can be sold at 
a net profit of 1/2 unit of money (say $500). “Net profits” are computed here 
before paying wages to the workers. The resulting game is equivalent to 
the above game with a = 1/2. Naturally, one would like to know what 
the wages should be. 


3. The Core 

If a = 1/2, the core of the game consists of the unique point {(0, 0, 1/2, 1/2, 1/2)}. 

It is justified in [4] as follows: “there is an oversupply of commodity A, 
and therefore intense competition will develop between players 1 and 2 in 
the determination of terms of exchange with Q. ... this competition drives 
the payoff to 1 and 2 down to zero; any attempt by either player to get more 
will lead to the other one forming a coalition with two out of three 
members of Q and ‘underselling’ him.” 
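
Core membership can be checked by brute force over all coalitions. The sketch below is an illustration only; it assumes a = 1/2 (the value consistent with the total income of $1.50 mentioned in this section), and the helper names `v` and `in_core` are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

PLAYERS = (1, 2, 3, 4, 5)
P, Q = {1, 2}, {3, 4, 5}

def v(S, a):
    """Characteristic function min(|S ∩ P|, a * |S ∩ Q|)."""
    S = set(S)
    return min(F(len(S & P)), a * len(S & Q))

def in_core(x, a):
    """x maps trader -> payoff. Core test: x(N) = v(N) and x(S) >= v(S) for all S."""
    if sum(x.values()) != v(PLAYERS, a):
        return False
    for size in range(1, len(PLAYERS)):
        for S in combinations(PLAYERS, size):
            if sum(x[i] for i in S) < v(S, a):
                return False
    return True

a = F(1, 2)  # assumed parameter value
core_point = {1: F(0), 2: F(0), 3: F(1, 2), 4: F(1, 2), 5: F(1, 2)}
priced = {1: F(1, 4), 2: F(1, 4), 3: F(1, 3), 4: F(1, 3), 5: F(1, 3)}

print(in_core(core_point, a))  # the point (0, 0, 1/2, 1/2, 1/2)
print(in_core(priced, a))      # an outcome giving traders 1 and 2 positive payoffs
```

The second outcome fails the test because coalition {1, 3, 4} receives 11/12, which is less than its worth of 1.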

It seems to me that this reasoning should be approached cautiously. 
Denote {i, j} = P, {p, q, r} = Q. Whereas it is true that there is a threat 
to form a coalition {i, p, q}, it is not at all clear that this threat will drive 
the payment to j down, and eventually down to zero. 

Note that j is not helpless: For one thing, he knows very well that 
without him all the rest of the traders will share together only $1.00. In 
order to reach the Pareto optimum total income of $1.50 they need his 
cooperation. Is it reasonable to expect that with this bargaining power he 
will feel obliged to cut his payoff down to zero? 

Now, let us examine the process of “underselling” more closely. Start, 
for example, with a payoff of (1/4, 1/4, 1/3, 1/3, 1/3), which may result if traders 
1 and 2 put a price of 1/3 per unit of their good, expecting to sell equal 
amounts, i.e., 3/4 units each. Core theory tells us that this price (and 
outcome) is not stable. Indeed, if player 1 cuts his price to, say, 7/24, 
he may attract two customers, say 3 and 4, who will buy all his merchandise 
with the resulting outcome (7/24, 17/48, 17/48) to {1, 3, 4}. But even if this 
threat is carried out successfully, traders 2 and 5 still can share $0.50. If 
I were trader 2, I would look straight into the eyes of trader 5 and say that 
I raise my price to 1/2 per unit, on the ground that an outcome of ($0.25, 
$0.25) looks fine to me. After all it is not my fault that trader 5 turned out 
to be excluded from the coalition. These considerations show that the 
arguments of [4] cited above, if valid, can be employed just as well to 
show that there is an “intense competition” among the members of Q, 
which drives their payments down. 
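
The arithmetic of this underselling scenario can be verified directly. The sketch below is illustrative and rests on stated assumptions: a = 1/2, a starting price of 1/3 per unit with 3/4 unit sold by each of traders 1 and 2, and variable names chosen here for exposition.

```python
from fractions import Fraction as F

a = F(1, 2)  # assumed holding of commodity B per trader in Q

# Assumed starting payoff: price 1/3 per unit of A, 3/4 unit sold by each seller.
start = {1: F(1, 4), 2: F(1, 4), 3: F(1, 3), 4: F(1, 3), 5: F(1, 3)}

# Trader 1 undercuts to 7/24 per unit and sells his whole unit to traders 3 and 4.
cut_price = F(7, 24)
worth_134 = min(F(1), 2 * a)                # v({1, 3, 4})
seller_gain = cut_price * 1                 # revenue of trader 1
buyer_gain = (worth_134 - seller_gain) / 2  # equal split between traders 3 and 4

# Traders 2 and 5 are left with v({2, 5}) to divide between them.
worth_25 = min(F(1), 1 * a)

print(seller_gain, buyer_gain, worth_25)
print(seller_gain > start[1], buyer_gain > start[3])
```

Under these assumptions the deviation is profitable for traders 1, 3, and 4, while the amount left to traders 2 and 5 says nothing about how they divide it, which is the point made in the text.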

These considerations, applied to the manufacturers of the Example 
in Section 2 indicate that the manufacturers may act stupidly if they 
provide their own machines at no profit. Both common sense and perhaps 
also real life experience show that they will ask for substantial profits. 
Although there is an oversupply of machines there is no oversupply of 
manufacturers because each of them is sure to get services of at least 
one skilled worker. 

We must therefore conclude that although a threat to form {i, p, q} is 
possible, with i cutting the price, it does not determine who should lose, 
trader j or trader r, if the threat is carried out, and therefore who should 
cut his payment. This puts the intuitive justification in favor of the core 
in grave doubt. 

Another lesson that one should draw from these considerations is that 
it is not sufficient to consider threat capabilities. One also has to study 
how the traders can react when faced with such threats. Such considerations 
are in the spirit of the bargaining set. The bargaining set 
defines outcomes as stable if each “objection” can be “countered.” The 
intuitive rationale of this notion simply means that cutting profits is not 
justified if a trader can protect his share even if he is faced with various 
threats. 

In the case a = 1/8, for example, the core makes better sense. It consists 
again of a unique outcome: {(0, 0, 1/8, 1/8, 1/8)}. For any outcome, say of the 
form $(\alpha, \alpha, \beta, \beta, \beta)$, $0 < \alpha \le 3/16$, $2\alpha + 3\beta = 3/8$, either {1, 3, 4, 5} 
or {2, 3, 4, 5} can be employed, say by trader 3, who threatens to share 
$(\alpha + \varepsilon,\ \beta + \varepsilon,\ \beta - \varepsilon + (\alpha/2),\ \beta - \varepsilon + (\alpha/2))$, $0 < \varepsilon < \alpha/2$. These threats 
cannot be countered by coalitions of the form {j, 4, 5}, whose worth is 
too small if, in addition, $\varepsilon < \alpha + \beta - 1/8$. Thus, each member of Q 
can play off traders 1 and 2 against each other by threatening to give one 
of them some profits and causing actual losses to the other. This capability 
may drive the payments of 1 and 2 down to zero. 

Qualitatively speaking, the difference between the case a = 1/2 and 
a = 1/8 is as follows: If a = 1/2 there is an oversupply of commodity A. 
This in itself is not, in my opinion, a valid reason to drive profits to its 
owners down to zero. If a = 1/8, there is an oversupply of traders in P. 
One member of Q is capable of employing all members of Q and a single 
member of P as partners in a solid threat. My contention is that, in contrast 
to oversupply of commodities, oversupply of traders drives prices down. 
This phenomenon will again be observed in a deeper sense in Section 4, 
when we study the more complicated case a = 5/12. 


4. The Bargaining Set $\mathcal{M}_1^{(i)}(N)$ 

If a = 1/8, the bargaining set (for the grand coalition) coincides with the 
core. As explained in the previous section this outcome seems quite 
intuitive. 

If a = 1/2, the bargaining set consists of every outcome that can be 
determined by assigning properly normalized prices to the commodities; 
namely, it is the straight line segment 

$$\{(\alpha, \alpha, \beta, \beta, \beta) : 0 \le \alpha \le 3/4,\ 2\alpha + 3\beta = 3/2\}.$$

It properly contains the core. 

Does this solution concept reflect the economics of the situation? One 
may claim that now we are pushing too far towards the other extreme: 
If, say, x = (3/4, 3/4, 0, 0, 0), to take an extreme case, and if a threat to form 
{1, 3, 4} is successful, then 2 must lose, because v({2, 5}) = 1/2 < 3/4. The 
answer is that 2 can reply by employing {2, 3, 5} or {2, 4, 5}, which, by 
the way, destroys the original threat. Qualitatively speaking, if the threat 
to employ {1, 3, 4} is interpreted as an attempt by trader 3 to play off 
trader 1 against trader 2, {2, 4, 5} is an equally convincing attempt by 
trader 2 to play off trader 4 (or 5) against trader 3. Thus, even in this 
extreme case, no trader is helpless. 

A very interesting case occurs if 1/3 < a < 1/2, say, a = 5/12. The bargaining 
set then consists of the interval 

$$\{(\alpha, \alpha, \beta, \beta, \beta) : 0 \le \alpha \le 3/8,\ 2\alpha + 3\beta = 5/4\}$$

whose end points are (0, 0, 5/12, 5/12, 5/12) and (3/8, 3/8, 1/6, 1/6, 1/6). 
Here, the bargaining set guarantees each member of Q a payoff of 1/6 
at least. What economic forces can account for such a result? A clue 
can be derived by examining the various objections and counter objections 
that determine the bargaining set: 

Consider a payoff² $x = (\alpha, \alpha, \beta, \beta, \beta)$, $\alpha > 0$, $2\alpha + 3\beta = 5/4 = v(N)$. 
Clearly $\alpha + \beta > 5/12 = v(\{i, p\})$; therefore, coalition {i, p} cannot be 
employed for threats. Since $v(\{i, p, q\}) = 5/6$, if {i, p, q} is employed in a 
threat its excess gain is $(5/6) - \alpha - 2\beta$. Similarly, the excess gain of a 
threatening coalition {i, p, q, r} is $1 - \alpha - 3\beta$. 

If $\beta < 1/6$, $1 - \alpha - 3\beta > (5/6) - \alpha - 2\beta$. 

If $\beta > 1/6$, $(5/6) - \alpha - 2\beta > 1 - \alpha - 3\beta$. 

² A formal proof concerning the structure of the bargaining set also requires treatment 
of nonsymmetric outcomes. Here we wish only to explain some of the underlying 
ideas. 

Thus, if $\beta < 1/6$, the coalitions {i, p, q, r} are the ones that should be used 
for threats. The coalitions {i, p, q} are unattractive because they yield low 
gains, too small even to be used as counter threats. Thus, if $\beta < 1/6$ the 
situation is similar to the case a = 1/8: traders 1 and 2 compete on offers by all 
members of Q to join Q, where each member of Q has also the power to 
cause actual losses to any one trader of P by refusing to cooperate in any 
coalition containing this trader. This competition drives the payments to 
the members of P down, but not down to zero. As soon as $\beta$ becomes 1/6 
the capability of a single member of Q to cause losses to members of P 
vanishes. If $\beta \ge 1/6$ the coalitions {i, p, q} can be employed as counter 
threats. In fact, if $\beta > 1/6$, no trader is interested in {i, p, q, r} because the 
three-person coalitions yield a higher excess gain. As explained in the case 
a = 1/2, in this case there is no force that drives payments down. To put 
it in other words: If $\beta < 1/6$ there is an oversupply of traders in P in the 
sense that one of them can cooperate with all members of Q in a way that 
renders the other trader helpless. If $\beta > 1/6$, although there is an oversupply 
of commodity A, profits should not be cut down because no player in P 
is really helpless when faced with a threat. 
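
The switch at the threshold 1/6 can be seen numerically. The following sketch is an illustration under the assumption a = 5/12; the function names are hypothetical, and the symmetric payoff is parameterized by the share given to each member of Q.

```python
from fractions import Fraction as F

a = F(5, 12)  # the case discussed in the text

def excess_three(alpha, beta):
    """Excess gain of a threatening coalition {i, p, q}."""
    return min(F(1), 2 * a) - alpha - 2 * beta

def excess_four(alpha, beta):
    """Excess gain of a threatening coalition {i, p, q, r}."""
    return min(F(1), 3 * a) - alpha - 3 * beta

def symmetric_payoff(beta):
    """Return (alpha, beta) with 2*alpha + 3*beta = v(N) = 3a."""
    return (3 * a - 3 * beta) / 2, beta

for beta in (F(1, 8), F(1, 6), F(1, 4)):
    alpha, _ = symmetric_payoff(beta)
    print(beta, excess_three(alpha, beta), excess_four(alpha, beta))
```

The difference between the two excess gains equals 1/6 minus the share of a member of Q, so the four-person coalitions are the stronger threat below 1/6 and the three-person coalitions above it.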

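We close this section by describing the core $\mathcal{C}$ and the bargaining set 
$\mathcal{M}_1^{(i)}$ for all parameters a. The proofs are straightforward but long; 
therefore we shall content ourselves with proving one case in the Appendix 
as an illustration of the technique. 

A. If $0 < a \le 1/3$, $\mathcal{C} = \mathcal{M}_1^{(i)} = \{(0, 0, a, a, a)\}$. 

B. If $1/3 < a \le 1/2$, $\mathcal{C} = \{(0, 0, a, a, a)\}$ and 

$$\mathcal{M}_1^{(i)} = \{(\alpha, \alpha, \beta, \beta, \beta) : 0 \le \alpha \le (9a - 3)/2,\ 2\alpha + 3\beta = 3a\}.$$

C. If $1/2 < a \le 2/3$, 

$$\mathcal{C} = \{(\alpha_1, \alpha_2, \beta_3, \beta_4, \beta_5) : a \le \alpha_i + \beta_p \le 3a - 1,\ 0 \le \alpha_i,\ 0 \le \beta_p,\ \alpha_1 + \alpha_2 + \beta_3 + \beta_4 + \beta_5 = 3a,\ i \in P,\ p \in Q\}.$$

Note that $\mathcal{C}$ contains outcomes of “unequal treatment.” 

$$\mathcal{M}_1^{(i)} = \mathcal{C} \cup \{(\alpha, \alpha, \beta, \beta, \beta) : 0 \le \alpha \le 3a/2,\ 2\alpha + 3\beta = 3a\}.$$

The points of $\mathcal{M}_1^{(i)}$ outside $\mathcal{C}$ have the equal treatment property. 

D. If $2/3 < a < 1$, 

$$\mathcal{C} = \mathcal{M}_1^{(i)} = \{(\alpha_1, \alpha_2, \beta_3, \beta_4, \beta_5) : a \le \alpha_i + \beta_p \le 1,\ 0 \le \alpha_i,\ 0 \le \beta_p,\ \alpha_1 + \alpha_2 + \beta_3 + \beta_4 + \beta_5 = 2,\ i \in P,\ p \in Q\}.$$

E. If $1 \le a < \infty$, $\mathcal{C} = \mathcal{M}_1^{(i)} = \{(1, 1, 0, 0, 0)\}$. 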


5. Monopolistic Syndication 

If all members of Q decide to act as one syndicate, the game is changed 
into a three person game. The bargaining set becomes 

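A. $\{(0, 0, 3a)\}$ if $0 < a \le 1/3$, 

B, C. $\{(\alpha_1, \alpha_2, 3\beta) : 0 \le \alpha_1 \le 3a - 1,\ 0 \le \alpha_2 \le 3a - 1,\ \alpha_1 + \alpha_2 + 3\beta = 3a\}$ if $1/3 < a \le 2/3$, 

D, E. $\{(\alpha_1, \alpha_2, 3\beta) : 0 \le \alpha_1 \le 1,\ 0 \le \alpha_2 \le 1,\ \alpha_1 + \alpha_2 + 3\beta = 2\}$ if $a > 2/3$. 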

Comparing this case with the unsyndicated one we observe that in all cases, 
the range of payoffs in the bargaining set to the members of P after 
syndication either diminishes in the high payments or expands in the low 
payments or remains unchanged; consequently, after syndication the 
members of Q may obtain higher sums in the bargaining set and never 
lower sums. Thus, from the point of view of the bargaining set, syndication 
of members of Q is never disadvantageous to them. Essentially, the reason 
is that after syndication, members of P cannot maintain profits by playing 
one member of Q against another. 

Similar computations show that syndication of the members of P is not 
disadvantageous. 


Appendix 

We shall prove that if $1/3 < a \le 1/2$, the bargaining set is given by B, 
Section 4. The other cases are proved in a similar fashion. 

Let $x = (\alpha_1, \alpha_2, \beta_3, \beta_4, \beta_5)$ be a payoff in $\mathcal{M}_1^{(i)}(N)$. Without loss of 
generality we may assume that $\alpha_1 \le \alpha_2$, $\beta_3 \le \beta_4 \le \beta_5$. We shall show 
that $\alpha_1 = \alpha_2$ and $\beta_3 = \beta_4 = \beta_5$. Suppose this is not the case; then either 
$\alpha_1 < \alpha_2$ or $\beta_3 < \beta_5$. 

(i) If $\alpha_1 + \beta_3 < a$ and $\beta_4 > a$ then $(\{1, 3\}, (a - \beta_3 - \varepsilon,\ \beta_3 + \varepsilon))$ 
is a justified objection at x of trader 3 either against trader 2 or against 
trader 5, provided that $0 < \varepsilon < \Delta_1 \equiv a - \alpha_1 - \beta_3$. (Trader 2 [5] can 
counter object only if $\alpha_2 = 0$ [$\beta_5 = 0$].) 

(ii) If $\alpha_1 + \beta_3 < a$, $\beta_4 \le a$ then 
$\alpha_2 + \beta_5 = 3a - \alpha_1 - \beta_3 - \beta_4 > 3a - a - a = a = v(\{2, 5\})$. 
Consequently, $(\{1, 3, 4\}, (a - \beta_3 - 2\varepsilon,\ \beta_3 + \varepsilon,\ a + \varepsilon))$ is a justified objection 
at x of trader 3 either against trader 2 or against trader 5, provided that 
$0 < 2\varepsilon < \Delta_1$. 

(iii) If $\alpha_1 + \beta_3 \ge a$ then $\alpha_2 + \beta_5 > a$ and consequently 
$\alpha_1 + \beta_3 + \beta_4 < 2a = v(\{1, 3, 4\})$. In this case 
$(\{1, 3, 4\}, (\alpha_1 + \varepsilon,\ \beta_3 + \varepsilon,\ 2a - \alpha_1 - \beta_3 - 2\varepsilon))$ is a justified objection at x of trader 3 either against 
trader 2 or against trader 5, provided that $0 < 2\varepsilon < \Delta_3 \equiv 2a - \alpha_1 - \beta_3 - \beta_4$ 
and $2\varepsilon < (\alpha_2 - \alpha_1) + (\beta_5 - \beta_3)$. Thus $\alpha_1 = \alpha_2 = \alpha$, $\beta_3 = \beta_4 = \beta_5 = \beta$, 
which implies $\alpha + \beta > a$, unless $\alpha = 0$, in which case $\beta = a$. 

(iv) If $\alpha > (9a - 3)/2$ then $\beta < 1 - 2a$. In this case, 
$1 - \alpha - 3\beta = 1 - \alpha - 3a + 2\alpha = 1 - 3a + \alpha > 1 - 3a + (9a - 3)/2 = (3a - 1)/2 \ge 0$. 
Thus, $(1 - \alpha - \beta)/2 > \beta$ and 
$(\{1, 3, 4, 5\}, (\alpha + \varepsilon,\ \beta + \varepsilon,\ (1 - \alpha - \beta - 2\varepsilon)/2,\ (1 - \alpha - \beta - 2\varepsilon)/2))$ 
is a justified objection of trader 3 against trader 2, 
provided that $0 < \varepsilon < \Delta_4 \equiv (1 - \alpha - \beta)/2 - \beta$ and $2\varepsilon < 1 - \beta - 2a$. 

It remains to show that $x = (\alpha, \alpha, \beta, \beta, \beta)$ belongs to $\mathcal{M}_1^{(i)}(N)$ if 
$\alpha \le (9a - 3)/2$. Obviously any objection of a member of P [Q] against 
another member of P [Q] can be countered by the subject to objection 
taking the same partners of the objector and offering them the same 
payoff. 

Let, again, {i, j} = {1, 2}, {p, q, r} = {3, 4, 5}. An objection of i against 
p cannot be justified unless also j is partner in the objection, since otherwise 
p could use j and all the partners of i but one to form a counter objection. 
Similarly an objection of p against i cannot be justified unless q and r 
are also partners in the objection. 

Now, {1, 2, q} or {1, 2, q, r} cannot be used for an objection of i against p, 
because $\alpha + \beta \ge a$. There remains to consider only an objection of p 
against i employing the coalition {j, p, q, r}. In this objection q and r 
will receive together less than $v(\{j, p, q, r\}) - \alpha - \beta = 1 - \alpha - \beta$. 
But then i will be able to counter object by employing the coalition 
{i, q, r}, because he will be able to offer q and r together the amount 
$v(\{i, q, r\}) - \alpha = 2a - \alpha$, which is not less than $1 - \alpha - \beta$. Thus, every 
objection can be countered and this completes the proof. 

Acknowledgment 

I wish to express my thanks and indebtedness to Robert J. Aumann for long conversations 
concerning the present paper and many illuminating comments, as well as 





the suggestion that this note may interest many readers. I am also indebted to Bezalel 
Peleg for many conversations and suggestions concerning the validity of the arguments 
that are employed here. 


References 

1. R. J. Aumann, Disadvantageous monopolies, J. Econ. Theory 6 (1973), 1-11. 

2. M. Maschler and M. Davis, Existence of stable payoff configurations for cooperative 
games, in “Essays in Mathematical Economics in Honor of Oskar Morgenstern,” 
(M. Shubik Ed.), pp. 39-52. Princeton Univ. Press, Princeton, N.J., 1967. 

3. B. Peleg, Existence theorem for the bargaining set $\mathcal{M}_1^{(i)}$, in “Essays in Mathematical 
Economics in Honor of Oskar Morgenstern,” (M. Shubik Ed.), pp. 53-56. Princeton 
Univ. Press, Princeton, N.J., 1967. 

4. A. Postlewaite and R. W. Rosenthal, Disadvantageous syndicates, J. Econ. 
Theory 9 (1974), 324-326. 

5. L. Shapley and M. Shubik, On market games, J. Econ. Theory 1 (1969), 9-25. 



JOURNAL OF ECONOMIC THEORY 13, 193-200 (1976) 


A Proof Technique for Social Choice with Variable Electorate 

Bengt Hansson 


Department of Philosophy, Lunds Universitet, Kungshuset, Lundagard, 
S-223 50 Lund, Sweden 

AND 

Henrik Sahlquist 

Department of Philosophy, Stanford University, Stanford, California 94305 
Received September 29, 1975; revised February 28,1976 


A recent development in the theory of social choice is the study of 
aggregation rules that are defined for a variable electorate. In such a 
context a new type of conditions can be defined, and one of the most 
interesting results in this area is due to H. P. Young [1]. He characterizes 
the Borda rule by a set of natural conditions, all formulated in the standard 
notation of elementary set theory and relational logic usually employed 
in this field. His proof, however, is fairly complicated, requiring knowledge 
of at least the rudiments of linear algebra and graph theory. 

While there is no doubt that the well-developed notational and conceptual 
apparatus of linear algebra can be of great help in social choice 
theory, we nevertheless think it is interesting to show that Young’s result 
does not depend on linear algebra in any deeper sense and that a simple 
combinatorial proof can be given. At the same time we think that our 
construction of what we propose to call the amplification of a situation has 
some independent interest and that it will bring out the essential features 
of Young’s use of his so-called consistency condition—his condition about 
group additivity. 

Let A be the set of alternatives, and let x, y, z range over A. The variable i 
will range over individuals or voters. The basis for the social choice will 
be a preference profile, election return, or as we shall call it, a situation; 
a, b, c, etc., will range over situations. Think of them as functions from the 
set of voters involved into the set of linear preference relations over A, 
or if we have ordered the voters linearly, as a sequence of preference 
relations, one for each voter. 



A social choice function f assigns a nonempty set of alternatives, the 
choice set, to each situation. Unlike what has been common in the Arrow 
tradition, the social choice function is not just defined for situations over 
a given set of individuals, but for all situations on arbitrarily large, but 
finite sets of voters. The set of alternatives is assumed to stay fixed. 

The Neutrality condition says that no importance is attached to the 
labeling of alternatives; if we permute the alternatives in a situation, the 
alternatives will be permuted the same way in the social choice (which can 
be considered as a partitioning of A into winners and losers). 

A social choice function satisfies Cancellation if, whenever it simultaneously 
happens, for all x and y, that the number of voters preferring 
x to y equals the number preferring y to x, then the choice set is the set 
of all alternatives. For scoring functions, this condition is tantamount 
to requiring that the steps be equal. 

A social choice function is Faithful if, when there is only one voter, that 
society chooses his or her top-ranked alternative. 

When two situations a and b have disjoint sets of voters we can combine 
them to form the situation a + b which has as voters the union of the two 
voter sets, and agrees with a on a's set of voters and with b on the other. 
If you think of a and b as functions from voter sets, a + b is just their 
set-theoretical union. 

A social choice function f is said to be Consistent if, whenever a + b 
is defined, 

$$f(a) \cap f(b) \ne \emptyset \quad \text{implies} \quad f(a) \cap f(b) = f(a + b).$$

(For a motivation of this condition, see Young’s paper.) 

The Borda choice function works by assigning a number, the Borda-score, 
to each alternative x according to the rule: sum over all individuals i 
the number of alternatives i prefers x to minus the number preferred to x. 
Then it chooses the alternatives with the highest assigned number. 
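
As an illustration of this scoring rule, here is a minimal sketch. It assumes that a situation is represented as a list of rankings, best alternative first; the function names are illustrative and are not taken from the paper.

```python
def borda_scores(situation, alternatives):
    """Sum, over voters, of (number ranked below x) minus (number ranked above x)."""
    scores = {x: 0 for x in alternatives}
    for ranking in situation:
        m = len(ranking)
        for position, x in enumerate(ranking):
            below = m - 1 - position   # alternatives this voter prefers x to
            above = position           # alternatives this voter prefers to x
            scores[x] += below - above
    return scores

def borda_choice(situation, alternatives):
    """Choice set: the alternatives with the highest Borda-score."""
    scores = borda_scores(situation, alternatives)
    best = max(scores.values())
    return {x for x in alternatives if scores[x] == best}

alternatives = ["x", "y", "z", "w"]
situation = [("y", "z", "x", "w"),   # voter 1
             ("x", "w", "z", "y")]   # voter 2

print(borda_scores(situation, alternatives))
print(borda_choice(situation, alternatives))
```

With four alternatives each voter contributes the scores 3, 1, -1, -3 from top to bottom, so the scores of any situation sum to zero.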

The four conditions above are the ones that characterize the Borda 
function. It is easily checked that it satisfies the conditions; we will give 
a proof that the Borda function is the only one satisfying them. 
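
The claim that the Borda function satisfies the conditions can be spot-checked on small cases. The sketch below is only a sanity check on particular examples, not a proof; it assumes situations are dictionaries from voter labels to rankings (best first), and the helper names are hypothetical.

```python
def borda_choice(situation, alternatives):
    """Borda choice set for a situation given as {voter: ranking}."""
    scores = {x: 0 for x in alternatives}
    for ranking in situation.values():
        m = len(ranking)
        for position, x in enumerate(ranking):
            scores[x] += (m - 1 - position) - position
    best = max(scores.values())
    return {x for x in alternatives if scores[x] == best}

def combine(a, b):
    """The situation a + b, defined only for disjoint voter sets."""
    if set(a) & set(b):
        raise ValueError("voter sets must be disjoint")
    return {**a, **b}

A = ("x", "y", "z")

# Faithful: a single voter's top-ranked alternative is chosen.
faithful = borda_choice({1: ("y", "z", "x")}, A)

# Cancellation: every pair is tied one against one, so all alternatives are chosen.
cancel = borda_choice({1: ("x", "y", "z"), 2: ("z", "y", "x")}, A)

# Consistency: f(a) and f(b) overlap, so f(a + b) equals their intersection.
a = {1: ("x", "y", "z")}
b = {2: ("x", "z", "y")}
lhs = borda_choice(a, A) & borda_choice(b, A)
rhs = borda_choice(combine(a, b), A)

print(faithful, cancel, lhs == rhs)
```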

Another condition the Borda function satisfies is Anonymity: the social 
choice does not depend on which individuals have the different preferences 
in the situation. Since the set of voters may vary from one situation to 
another, we cannot define Anonymity as is usually done, in terms of 
permutations of individuals. We must say that if two sets of voters have 
the same preference relations, then the social choice should be the same 
in both cases, i.e., if the number of individuals having any given preference 
relation is the same in both situations, then the social choice is the same. 

Young’s proof has three main steps: 





When f satisfies the four conditions, 

A. f depends only on the net number of voters preferring x to y, 
for all x and y,¹ 

B. f depends only on the Borda-scores, 

C. f depends on the Borda-scores in the right way, picking the 
alternatives with maximal Borda-score. 

Since A follows from B and B can just as easily be proved without A, 
we will only do B. 

Before we proceed with the main proof we will introduce the notion 
of an amplification of a situation and define the concept of Borda-score. 
The former is a tool for investigating the relative position of a subset of 
alternatives by “wiping out” the disturbances of all other alternatives. 
We do this by creating new situations (with disjoint sets of voters) where 
we keep the positions constant for those alternatives we are interested in, 
but permute the other ones. If we create new such situations for all 
possible permutations and add all the situations thus obtained we get a 
grand new situation where, on the average, all differences between alter¬ 
natives that we are not interested in are wiped out, whereas the relationship 
between the interesting alternatives has just been multiplied by a certain 
factor. The important thing about the Consistency condition is that it 
enables us to relate the choice sets from the original and amplified 
situations. 

To be more precise, consider the set of all permutations p of the
alternatives in a such that the alternatives x, y, ... are kept fixed. Let
p(a) be the situation we get by permuting the alternatives according to p.
We want to form $\sum_p p(a)$, where the sum is taken over all p's in this
set, and call this situation $a'_{xy\ldots}$, the amplification of a with
respect to x, y, ....

Let, e.g., the set of alternatives be {x, y, z, w} and the situation a have
two individuals, 1 and 2, with the following preferences:

1. y, z, x, w,

2. x, w, z, y.

If we keep the positions of x fixed we will get the following scheme, where
the dashes denote empty spaces:

1. -, -, x, -,

2. x, -, -, -.

In order to construct the amplification $a'_x$ we must fill in all these empty
spaces with y’s, z’s, and w’s in all possible combinations. Since these
three alternatives can be permuted in six different ways, $a'_x$ will have six
groups of two individuals, i.e., twelve individuals:


 1. y, z, x, w,
 2. x, y, z, w,
 3. y, w, x, z,
 4. x, y, w, z,
 5. z, y, x, w,
 6. x, z, y, w,
 7. z, w, x, y,
 8. x, z, w, y,
 9. w, y, x, z,
10. x, w, y, z,
11. w, z, x, y,
12. x, w, z, y.


Of course, a sum of situations is only well defined when the voter sets
are disjoint. To assure that, we pick new voters whenever needed, and when
we want to add, say, $p_1(a)$, we actually add an isomorphic copy of it over
a new set of voters. For definiteness, we fix on one ordering of the
permutations and one of the individuals, and build up $a'_{xy\ldots}$ by starting with
$p_1(a)$, adding a copy of $p_2(a)$ over the first unused voters, etc.

We don’t yet know that $f(a'_{xy})$ is independent of the orderings chosen,
but that turns out to be irrelevant for our proof. Moreover, A below will
imply that f is anonymous, so isomorphic situations will have the same
choice set. Thus $f(a'_{xy})$ is independent of the choice of orderings.

Once we have proved Anonymity we can use Consistency more freely. 
Granted an infinite stock of voters we can always find isomorphic 
situations on new voters, so we can just as well talk directly about a + b 
or multiples of a even if the voter sets are not disjoint. 

To define the Borda-score, let us first introduce the notation $\beta_{xi}(a)$,
the contribution from individual i to the Borda-score of alternative x
in situation a. $\beta_{xi}(a)$ is defined to be the number of alternatives individual i
considers x strictly preferred to minus the number he strictly prefers to x
in situation a. The Borda-score of x in a, $\beta_x(a)$, is then defined as
$\beta_x(a) = \sum_i \beta_{xi}(a)$, the sum being taken over all individuals i in a.
Since already, for a fixed individual, the sum of the contributions to the
Borda-scores of the different alternatives adds up to zero, the sum of all
the Borda-scores in a given situation must be zero too. This will be a
useful fact in later calculations.

The Borda-scores in an amplified situation are easily calculated. If n
is the number of alternatives, there are (n − 1)! permutations in $a'_x$.
Since x has the same Borda-score in each permutation, we directly obtain
$\beta_x(a'_x) = (n-1)! \cdot \beta_x(a)$. For all other alternatives the situation is
completely symmetrical, and therefore they must have the same
Borda-scores among themselves. Since the sum of the Borda-scores is zero we
get $\beta_y(a'_x) = -(n-2)! \cdot \beta_x(a)$ for each y different from x. The important
thing about Borda-scores for amplified situations is that they are multiples
of $\beta_x(a)$, whereas the size of the factor is rather inessential.
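
As a numerical check of these two formulas, the following sketch builds the amplification with respect to x of the two-voter situation 1. y, z, x, w; 2. x, w, z, y and compares the resulting scores with (n − 1)! and (n − 2)! multiples of the original score of x. The list-of-rankings representation and the helper names are assumptions made for illustration; new voters are modelled simply by appending rankings.

```python
from itertools import permutations
from math import factorial


def borda_scores(situation):
    """Borda-score of each alternative; rankings list the best alternative first."""
    scores = {}
    for ranking in situation:
        n = len(ranking)
        for pos, x in enumerate(ranking):
            scores[x] = scores.get(x, 0) + (n - 1 - pos) - pos
    return scores


def amplify(situation, fixed):
    """Sum of p(a) over all permutations p of the alternatives that keep `fixed` in place.

    Each permuted copy is added over new voters, i.e. its rankings are appended.
    """
    alternatives = sorted(situation[0])
    movable = [x for x in alternatives if x not in fixed]
    amplified = []
    for image in permutations(movable):
        p = dict(zip(movable, image))  # the permutation; fixed alternatives map to themselves
        for ranking in situation:
            amplified.append(tuple(p.get(x, x) for x in ranking))
    return amplified


a = [("y", "z", "x", "w"), ("x", "w", "z", "y")]
n = len(a[0])
a_x = amplify(a, {"x"})

beta = borda_scores(a)
beta_amp = borda_scores(a_x)
print(len(a_x), beta["x"], beta_amp)
```

With four alternatives the amplified situation has twelve voters, and the score of x is multiplied by 3! while every other alternative receives −2! times the original score of x.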

The following fact about the choice set of amplified situations shows 
how amplification uses Consistency: 

Lemma 1. If $x \in f(a) \neq A$, then $f(a'_x) = \{x\}$.

Proof. By Neutrality $x \in f(p(a))$ for all p that leave x invariant, so
by Consistency $f(a'_x)$ is the intersection of all $f(p(a))$. But if $y \neq x$
there must be some p such that $y \notin f(p(a))$, so $y \notin f(a'_x)$. Therefore
$f(a'_x) = \{x\}$.

Proof of B. We want to show that f depends on the Borda-scores
only, i.e., that if $\beta_x(a) = \beta_x(b)$ for all x, then f(a) = f(b). As a first step
we will show that f(a) = A whenever $\beta_x(a) = 0$ for all x. This assumption
means that in all the pairwise comparisons the individuals have made in
the situation a, any x has won just as many times as it has lost. Fix an
x and look at the amplification $a'_x$. Since all permutations of a (holding
x fixed) are in $a'_x$, any particular y will occupy the positions of all z’s in a
the same number of times and it will therefore beat x in $a'_x$ the same
number of times as it loses. Because of the complete symmetry this will
also hold for alternatives different from x in $a'_x$ and the requirements in
the cancellation condition are thus fulfilled and we conclude that
$f(a'_x) = A$. Then we must have f(a) = A too, for if this were not the case,
Lemma 1 would be applicable and yield $f(a'_x) = \{x\}$.

To prove that the $\beta_x(a)$ always uniquely determine f(a) we temporarily
introduce the notation $\bar{a}$ for a situation where each individual has the
opposite preferences as compared with a. We also assume that $\bar{a}$ is defined
on a voter set disjoint from all others in the same context. Strictly speaking,
this leaves $\bar{a}$ not quite well defined, but the ambiguity of the voter set does
not matter in the present proof and is seen to be irrelevant when it is
complete.

If $\beta_x(a) = \beta_x(b)$ for all x, then the first part of the proof is applicable
both to $a + \bar{a}$ and $b + \bar{a}$. If a and b have disjoint voter sets we therefore
have:

$f(a) = f(a) \cap A = f(a) \cap f(\bar{a} + b) = f(a + \bar{a} + b)$

$= f(a + \bar{a}) \cap f(b) = A \cap f(b) = f(b),$

where the third and fourth equality signs hold by Consistency.

If a and b do not have disjoint voter sets, we introduce an isomorphic
copy, c, of a on a new set of voters and apply the foregoing first to a and c
and then to c and b. This completes the proof that f only depends on the
Borda-scores. As a consequence of this we see that the anonymity
condition must be satisfied, which solves our earlier problems about
well-definedness of $a'_x$ and $\bar{a}$.
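
The step that makes the first part of the proof applicable is that reversing every voter's preferences negates every Borda-score, so a situation added to its reverse has all scores equal to zero. A minimal sketch of that fact, assuming situations are lists of rankings (best first) and using hypothetical helper names:

```python
def borda_scores(situation):
    """Borda-score of each alternative; rankings list the best alternative first."""
    scores = {}
    for ranking in situation:
        n = len(ranking)
        for pos, x in enumerate(ranking):
            scores[x] = scores.get(x, 0) + (n - 1 - pos) - pos
    return scores


def reverse_situation(situation):
    """Situation in which each individual has the opposite preferences."""
    return [tuple(reversed(ranking)) for ranking in situation]


a = [("y", "z", "x", "w"), ("x", "w", "z", "y")]
a_bar = reverse_situation(a)

scores_a = borda_scores(a)
scores_a_bar = borda_scores(a_bar)
# Adding situations over disjoint voter sets is modelled by list concatenation.
scores_sum = borda_scores(a + a_bar)
print(scores_a, scores_a_bar, scores_sum)
```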

Before we proceed with the proof of part C, we need the following 
lemmas. 

Lemma 2. If $\beta_x(a) < 0$, then $x \notin f(a)$.

Proof. Suppose $\beta_x(a) < 0$. In a situation with one voter having x
at the bottom and the other alternatives in some arbitrary order and one
voter having x at the bottom still, but all the other alternatives in reverse
order, the Borda-score of x would be −2(n − 1), and those of all other
alternatives would be the same among themselves. If therefore d is a
multiple by $-\tfrac{1}{2}(n-2)! \cdot \beta_x(a)$ of this situation, d and $a'_x$ have exactly
the same Borda-scores and $f(d) = f(a'_x)$ by part B. Let $\bar{d}$ be the situation
in which each individual has the opposite preferences as compared with d.
Then $f(\bar{d}) = \{x\}$ by Faithfulness and Consistency and $f(d + \bar{d}) = A$ by
Cancellation, so $f(d)$ and $f(\bar{d})$ must be disjoint by Consistency. Therefore
$x \notin f(d) = f(a'_x)$.
Then x cannot be in f(a) either, for by Lemma 1 it would then be in $f(a'_x)$
unless f(a) = A. But if this were the case $f(a'_x)$ would be A too, by
Consistency, which is false.

Lemma 3. If $\beta_x(a) = 0$, then $x \in f(a)$ only if $\beta_y(a) = 0$ for all y (and
then f(a) = A).

Proof. Suppose $\beta_x(a) = 0$, $x \in f(a)$ and $\beta_y(a) \neq 0$ for some y. Then
the Borda-score of some alternative is negative (since their sum is zero)
and $f(a) \neq A$ by Lemma 2. Then Lemma 1 can be applied and it yields
$f(a'_x) = \{x\}$, which contradicts the first part of the proof of B since
$\beta_z(a'_x) = 0$ for all z.

Proof of part C. We want to show that / depends on the Borda-scores 
in the right way, i.e., chooses the alternatives with greatest Borda-score. 
We begin with one half of this: if y has greater Borda-score than x in 
situation a, then x is not in f(a).

The general idea of the proof is to find a situation d with the same
Borda-scores for x and y as a, but constructed in such a way that we can
independently conclude that it does not have x in its choice set. It is true
that we cannot apply part B above immediately and conclude that f(a)
and f(d) are the same, for the Borda-scores might be different for some z
different from x and y; but we can wipe out such differences by using the
amplified situations $a'_{xy}$ and $d'_{xy}$ instead, in a way to be explained below.
The logical structure of the proof will thus be to assume that $x \in f(a)$,
to prove that it then has to be in $f(a'_{xy})$ too, which must be the same as
$f(d'_{xy})$, but that x cannot be in this latter set because of the construction
of d.

Suppose therefore that y has greater Borda-score than x and that x
is in f(a). We will construct d in two parts: one that takes care of the
difference in Borda-score between y and x and one that adjusts the scores
to the right absolute level. The difference between two Borda-scores is
always an even number, so the smallest possible difference is 2, which it
is when y is immediately above x. We therefore start with an ordering such
that x is in the middle if the number of alternatives is odd and immediately
below the middle if the number is even and with y immediately above x.
In such an ordering the Borda-scores for x and y are 0 and 2 in the first
case and −1 and +1 in the second case. Since $\beta_y(a) - \beta_x(a)$ is even, we
need only take a sufficient number of such orderings to have a situation
with the same difference. Call this situation b.

We now turn to the second part of d, the part that is to adjust the scores 
to the right absolute level. In the case of an odd number of alternatives 
we consider an individual who has x in the middle position and y 
immediately above and the other alternatives in an arbitrary order. In 
the even case he shall have x immediately above the middle (at the bottom 
of the upper half) and y immediately above x. Then, in both cases, add 
another individual who has swapped the positions of x and y. In this group 
of two, x and y will have the same Borda-scores, viz. 2 in the odd case and 
4 in the even. Let c be the multiple of this group which, when added to b, 
yields the same Borda-score for x as a. (In the even case it may happen 
that the required increment in Borda-score for x is not divisible by 4. 
If so, we start with a multiple of situation a instead. By Consistency it has 
the same choice set as a.) 
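
The scores quoted for the building blocks of b and c can be checked mechanically. The sketch below is an illustration only: the filler alternatives z0, z1, ... and the function names are hypothetical, rankings list the best alternative first, and the even case assumes at least four alternatives.

```python
def score(ranking, x):
    """Contribution of one voter to the Borda-score of x (best alternative first)."""
    pos = ranking.index(x)
    return (len(ranking) - 1 - pos) - pos


def total(situation, x):
    """Borda-score of x in a situation given as a list of rankings."""
    return sum(score(ranking, x) for ranking in situation)


def b_ordering(n):
    """One ordering of situation b: y immediately above x, with x in the middle
    (n odd) or immediately below the middle (n even)."""
    others = [f"z{k}" for k in range(n - 2)]
    pos_x = n // 2  # 0-based position of x in both the odd and the even case
    return others[:pos_x - 1] + ["y", "x"] + others[pos_x - 1:]


def c_group(n):
    """Group of two voters used for c: x in the middle (n odd) or immediately
    above the middle (n even, n >= 4), y immediately above x, plus a second
    voter with x and y swapped."""
    others = [f"z{k}" for k in range(n - 2)]
    pos_x = n // 2 if n % 2 else n // 2 - 1
    first = others[:pos_x - 1] + ["y", "x"] + others[pos_x - 1:]
    swap = {"x": "y", "y": "x"}
    second = [swap.get(t, t) for t in first]
    return [first, second]


for n in (5, 4):
    b = b_ordering(n)
    c = c_group(n)
    print(n, score(b, "x"), score(b, "y"), total(c, "x"), total(c, "y"))
```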

If we now define d as b + c we know that a and d have the same
Borda-scores for x and y. Any differences for other alternatives disappear when
we go to $a'_{xy}$ and $d'_{xy}$ instead, and by part B we conclude $f(a'_{xy}) = f(d'_{xy})$.
x was supposed to be in f(a) and thus in $f(a'_{xy})$ too. If we can show, by
construction, that x is not in $f(d'_{xy})$, we will have our contradiction.

$d'_{xy} = (b + c)'_{xy} = b'_{xy} + c'_{xy}$. By Lemmas 2 and 3 $f(b'_{xy}) = \{y\}$,
because y is the only alternative with a positive Borda-score in $b'_{xy}$. In
$c'_{xy}$, x and y are the only alternatives with positive Borda-scores and
therefore the only candidates for the choice set. But Neutrality yields, because
of the symmetry, that one cannot be in it without the other. Therefore
$f(c'_{xy}) = \{x, y\}$. The consistency condition then yields
$f(d'_{xy}) = f(b'_{xy}) \cap f(c'_{xy}) = \{y\}$, which contradicts the above.


We have now proved the first part of C, viz. that only alternatives with
maximal Borda-score can be in the choice set. It remains to prove that
all such alternatives are in the choice set. Let therefore x be such an
alternative, and let y be an alternative in f(a). By the first part of this proof y
has a maximal score and $\beta_y(a) = \beta_x(a)$. If we try to construct a situation d
like in the first part of this proof, we will find that b is not needed since
it was intended to take care of the difference in Borda-score. Therefore a
and c have the same Borda-scores for x and y and $a'_{xy}$ and $c'_{xy}$ the same
scores for all alternatives, from which follows $f(a'_{xy}) = f(c'_{xy}) = \{x, y\}$.

And if x is in $f(a'_{xy})$ it must be in f(a) too, by a simple extension of
Lemma 1. This concludes the proof of part C.

Exercise. What simplifications can be made in the proof of C if one, 
instead of linear orders, considers (a) preorders or so-called weak orders 
(transitive and strongly connected), (b) arbitrary relations? 


Reference 


1. H. P. Young, An axiomatization of Borda’s rule, J. Econ. Theory 9 (1974), 43-52. 



JOURNAL OF ECONOMIC THEORY 13, 201-216 (1976)


Asymptotic Stability of Stationary Temporary Equilibria 
and Changes in Expectations* 

Gerard Fuchs 

Laboratoire d'Econométrie de l'École Polytechnique,

17 rue Descartes, 75005 Paris, France 

Received September 8, 1975 


Recent results on sequences of temporary equilibria have given conditions 
for local asymptotic stability of stationary temporary equilibria. Simultaneously 
these conditions entail that stability is preserved under sufficiently small per¬ 
turbations in the expectations of the agents. The present paper gives an evalua¬ 
tion of how large these perturbations can in fact be without the stability being 
destroyed. It also discusses the way this evaluation depends on the character¬ 
istics of the economy. 


Introduction 

A first series of results on the dynamics of sequences of temporary 
equilibria, in a model derived from Samuelson’s pure consumption loan 
model [1], has recently been given (see [2]). In particular, for economies 
belonging to some admissible (open-dense) set, a sufficient condition for 
local asymptotic stability of stationary temporary equilibria (S.T.E.) was 
exhibited; it was then proven that S.T.E. satisfying this condition were 
locally structurally stable, i.e., kept their nature of attractors under 
sufficiently small perturbations in the data of the economy, in particular,
the price expectations of the agents. 

The aim of this paper is to extend this last result by giving an answer to 
the question: how much can the agents actually move their expectations 
without destroying the stability of a given S.T.E.? More precisely we 
give an evaluation of the size of the ball (in the space of expectation 
functions) inside which the expectations of the agents can move, voluntarily 
or involuntarily, so that a given S.T.E. which is an attractor remains so. 

The analytical framework and notations are exactly the ones already
used in [2]. They are rapidly recalled in Section 1, which also gives the
precise setting of the problem. Section 2 presents and proves our
evaluation. Section 3 discusses the way the evaluation varies with the
characteristics of the economy. Section 4 then illustrates the previous
results for the simple situation where traders (in particular, with demands
deriving from logarithmic utility functions) exchange only one
consumption good and money. Further comments and remarks are then
presented in the Conclusion.

* The author wishes to thank Jean-Michel Grandmont and Guy Laroque for helpful
remarks.

Copyright © 1976 by Academic Press, Inc.
All rights of reproduction in any form reserved.

The reader who does not wish to bother with too much mathematics 
can get a flavor of the paper by looking at the beginning of Section 1 
(definitions and notations) and by going afterwards directly to Section 4. 


1. Setting of the Problem

We first recall without comments the definitions and notations of the
model in [2].

There are l − 1 nonstorable goods and money. There are I consumers
who live for two periods and receive an endowment in nonstorable goods
at the beginning of each period of their lives. Consumers trade on the
spot market of the corresponding period. Young consumers can save by
buying from the old ones the money the latter have themselves bought
during the previous period so that the total monetary stock remains constant.

At time t the action of the young consumer (i1) depends on the current
price system $p_t$ and on his expectation $p^i_{t+1}$ of the price system that will
prevail at time t + 1. This expectation depends on the current prices and
on the T past observed price systems. At time t the action of the old
consumer (i2) (born at time t − 1) depends on his past action, i.e., on $p_{t-1}$
and $p^i_t$, and on the present actual price system $p_t$. The set of prices¹ is
$P = R_+ \times S$.

A consumer i is characterized by:

two demand functions²: $z_{i1} \in C^1(P^2, R^l)$ and $z_{i2} \in C^1(P^3, R^l)$, the first
components of which describe demands for money, the last l − 1
components excess-demands for nonstorable goods; we suppose³ that for any
$p_{t-1}$, $p_t$, $p^i_t$, $p^i_{t+1}$ in P:

(A.1) $z^1_{i1}(p_t, p^i_{t+1}) > 0$

(A.2) $p_t \cdot z_{i1}(p_t, p^i_{t+1}) = 0$, $p_t \cdot z_{i2}(p_{t-1}, p^i_t, p_t) = 0$

(A.3) $z^1_{i2}(p_{t-1}, p^i_t, p_t) = -z^1_{i1}(p_{t-1}, p^i_t)$

(A.4) $z^1_{i1}(p, p) = (p^1)^{-1} z^1_{i1}(\hat{p}, \hat{p})$

$z^k_{i1}(p, p) = z^k_{i1}(\hat{p}, \hat{p})$

$z^k_{i2}(p, p, p) = z^k_{i2}(\hat{p}, \hat{p}, \hat{p})$

($\hat{p} = (1, p^2, \ldots, p^l)$, k = 2, ..., l);

a price expectation function $\psi_i \in C^1(P^{T+1}, R^l)$ with:

(A.5) $\psi_i(P^{T+1}) \subset P$

(A.6) $\psi_i(p, \ldots, p) = p$.

¹ $R_+ = \,]0, +\infty[$ is the set of prices for money, $S = \{p \in R^{l-1} \mid \ldots = 1,\ p^k > 0\ \forall k\}$
is the set of prices for goods.

² $C^1(X, Y)$ is the space of continuously differentiable maps from X to Y.

³ · stands for Euclidean scalar product.

The space of agents A is the set of $(z_{i1}, z_{i2}, \psi_i)$ (i = 1, ..., I) such that
(A.1) to (A.6) hold. The excess demand function Z is the map from A to
$C^1(P^{T+2}, R^l)$ defined through
($\mathcal{A} \in A$, $e_t = (p_t, \ldots, p_{t-T}) \in P^{T+1}$, $s_t = (p_t, e_{t-1}) \in P^{T+2}$):

$Z(\mathcal{A}; s_t) = \sum_i z_{i1}(p_t, \psi_i(e_t)) + \sum_i z_{i2}(p_{t-1}, \psi_i(e_{t-1}), p_t)$   (1)

where $Z(\mathcal{A}; s_t)$ is the aggregated excess demand at time t for the agents $\mathcal{A}$
and the set of prices $s_t$.

An economy $\mathcal{E} = (\mathcal{A}, m)$ is then an element of $E = A \times R_+$ where m
describes the total strictly positive monetary stock.

The correspondence of temporary equilibria $V(\mathcal{E}; e_{t-1})$ is defined for any
$\mathcal{E} = (\mathcal{A}, m)$ and any $e_{t-1}$ with: $M(\mathcal{A}, e_{t-1}) = \sum_i z^1_{i1}(p_{t-1}, \psi_i(e_{t-1})) = m$,
as the set of prices $p_t$ such that $Z(\mathcal{A}; p_t, e_{t-1}) = 0$ (from (A.3) then
$M(\mathcal{A}, e_t) = M(\mathcal{A}, e_{t-1}) = m$).

A future trajectory⁴ is an infinite sequence of $p_t$ (t = 0, 1, ...) with:
$p_t \in V(\mathcal{E}; e_{t-1})$ $\forall t \geq T + 1$.

The correspondence of stationary temporary equilibria $W(\mathcal{E})$ is, for any
$\mathcal{E} = (\mathcal{A}, m)$, the set of prices p such that: $M(\mathcal{A}; e) = m$ and $Z(\mathcal{A}; s) = 0$
($e = (p, \ldots, p) \in P^{T+1}$, $s = (p, e) \in P^{T+2}$).

Remark 1. Thanks to (A.6), $W(\mathcal{E})$ does not depend on the price
expectations.

We do not recall here the standard assumptions made in [2] to guarantee
that V and W are not empty.

The results of [2] can then be summarized as follows:

There is an open dense subset $\mathcal{V}$ in E (for the topology $\tau_1$ of $C^1$ uniform convergence)
such that, given $\mathcal{E}$ in $\mathcal{V}$:

(i) The set of admissible states of the economy:

$X(\mathcal{E}) = \{e \in P^{T+1} \mid M(\mathcal{A}, e) = m\}$

is a $C^1$ submanifold of $P^{T+1}$ with codimension 1;

(ii) $W(\mathcal{E})$ is a finite set which varies continuously around $\mathcal{E}$;

(iii) Given any $p \in W(\mathcal{E})$ there exists a neighbourhood V of e in $X(\mathcal{E})$ such
that for $e_{t-1}$ in V the equation $Z(\mathcal{E}; p_t, e_{t-1}) = 0$ can be solved in $p_t$ by:
$p_t = \pi(\mathcal{E}; e_{t-1})$ and the map $\varphi(\mathcal{E}, e; \cdot)$ defined by:

$q_t = \pi(\mathcal{E}; e_{t-1})$

$q_{t-1} = p_{t-1}$   (2)

...

$q_{t-T} = p_{t-T}$

is locally a diffeomorphism with fixed point e the action of which gives the evolution
of the states after a unit time interval;

(iv) Around $\mathcal{E}$, $\varphi$ is a continuously differentiable function of the economy;

(v) Moreover $\mathcal{V}$ itself contains another open dense subset such that, for $\mathcal{E}$
in it, all e are hyperbolic⁵ fixed points of $\varphi$ and have their stability character preserved
under sufficiently small perturbations of $\mathcal{E}$.

⁴ This definition extends the one used in [2], where t was an element in Z. But we
shall only be interested here in what happens for $t \to +\infty$.

The set $\mathcal{V}$ is defined as the set of economies $\mathcal{E}$ such that the following
conditions hold (p being any element in $W(\mathcal{E})$):

(C.0) For any e in $X(\mathcal{E})$ the function $M(\mathcal{A}; e)$ has at least one
derivative which is not zero (implies i);

(C.1) The Jacobian of $Z^2(\mathcal{E}; p, \ldots, p), \ldots, Z^{l-1}(\mathcal{E}; p, \ldots, p)$ with respect
to $p^2, \ldots, p^{l-1}$ at point s has full rank l − 2 (implies ii);

(C.2) The Jacobian $Z_t$ of $Z^1(\mathcal{E}; s_t), \ldots, Z^{l-1}(\mathcal{E}; s_t)$ with respect to
$p^1_t, \ldots, p^{l-1}_t$ at point s has full rank l − 1 (implies that $Z(\mathcal{E}; s_t) = 0$ can be
solved in $p_t$);

(C.3) The Jacobian $Z_{t-T-1}$ of $Z^1(\mathcal{E}; s_t), \ldots, Z^{l-1}(\mathcal{E}; s_t)$ with respect to
$p^1_{t-T-1}, \ldots, p^{l-1}_{t-T-1}$ at point s has full rank l − 1 (implies that $Z(\mathcal{E}; s_t) = 0$ can be
solved in $p_{t-T-1}$).

The word “around” in (ii) and (iv) then means precisely: in the same
connected component of $\mathcal{V}$ as $\mathcal{E}$. Now as we shall only be interested in
what happens around e, we shall forget about (C.0)⁶ and, as we shall only
consider future trajectories, we shall also forget about (C.3).

Our starting point then will be Theorem 8 in [2] from which we derive
the weaker but more obvious version:

Proposition. Given an economy $\mathcal{E}$ in $\mathcal{V}$, a sufficient condition for a
S.T.E. p in $W(\mathcal{E})$ to be locally asymptotically stable for the dynamics
generated by (2) is that⁷:

$\|(Z_t)^{-1}\| \cdot \sum_{v=t}^{t-T} \|\hat{Z}_{v-1}\| = \alpha(\mathcal{E}, p) < 1$   (3)

($\hat{Z}$ is the map obtained from Z by substituting −m to
$\sum_i z^1_{i2}(p_{t-1}, \psi_i(e_{t-1}), p_t)$ in $Z^1$; $\hat{Z}_u$ is then the Jacobian of
$\hat{Z}^1(\mathcal{E}; s_t), \ldots, \hat{Z}^{l-1}(\mathcal{E}; s_t)$ with respect to
$p^1_u, \ldots, p^{l-1}_u$ at point s (u = t, t − 1, ..., t − T − 1); note that from (1) and
(A.3) we have $\hat{Z}_t = Z_t$)⁸. As noticed in [2], (3) implies hyperbolicity and
local structural stability of e.

We shall give at the beginning of Section 3 (Remark 5) an economic
interpretation of (3).

⁵ A diffeomorphism $\varphi$ from X to X is said to have x as an hyperbolic fixed point if
$\varphi(x) = x$ and if the Jacobian of $\varphi$ at point x has no eigenvalue of modulus one (see
[2] or [3]).

⁶ Indeed one sees easily from (A.4) that (C.0) is always satisfied on the diagonal
of $P^{T+1}$ or, in other words, that $X(\mathcal{E})$ is always a local manifold around any e.

Remark 2. In fact, from [2], $\alpha(\mathcal{E}, p)$ is an upper bound for the modulus
of the eigenvalues of the Jacobian at point e of the map $\varphi$ in (2). In other
words, near e, the dynamics are a contraction with rate $r(\mathcal{E}, p)$ greater than
or equal to $1 - \alpha(\mathcal{E}, p)$.

The problem we shall be interested in from now on is the following.
We start from some $\mathcal{E}$ in $\mathcal{V}$ and $p \in W(\mathcal{E})$ such that $\alpha(\mathcal{E}, p) < 1$ so e is an
attractor. Let us change $\mathcal{E} = (z_{i1}, z_{i2}, \psi_i, m)$ to $\mathcal{E}' = (z_{i1}, z_{i2}, \psi_i', m)$.

We know from Remark 1 that $p \in W(\mathcal{E}')$ and, from (v), that if $\psi_i' - \psi_i$ is
small enough p will remain an attractor. Calling $B(\psi_i, \delta)$ the open ball
(for the $C^1$ uniform metric) in the space of expectation functions (subset of
$[C^1(P^{T+1}, R^l)]^I$ with (A.5) and (A.6) for all i) with center $\{\psi_i\}$ and radius
$\delta$, we wish to find some $\delta_0 > 0$ such that, for any $\{\psi_i'\}$ in $B(\psi_i, \delta_0)$, p is
still a locally asymptotically stable S.T.E.

To evaluate such a $\delta_0$ we shall have to look at several different problems.
Suppose that in the space E we begin to leave $\mathcal{E}$ in going along some
definite path; suppose that our path is such that only the $\psi_i$ are changed.
Then the stability character of e can “apparently” be altered at some point
$\mathcal{E}'$ of the path for three distinct reasons:

(1) $\mathcal{E}'$ falls out of $\mathcal{V}$ because condition (C.1) is violated; but this
cannot happen because (C.1) is independent of the $\psi_i$ (from (A.6));

(2) $\mathcal{E}'$ falls out of $\mathcal{V}$ because (C.2) is violated;

(3) at $\mathcal{E}'$ inequality (3) stops being verified.

⁷ $\|\cdot\|$ is the usual norm for linear operators between Euclidean spaces.

⁸ As mentioned in [2], the introduction of $\hat{Z}$ instead of Z is just a trick to consider
derivatives along the space of states $X(\mathcal{E})$ and forget about the derivative normal to
$X(\mathcal{E})$ in $P^{T+1}$.


Our number $\delta_0$ will then be defined as the radius of the ball in expectation
space such that for any path inside this ball no event of type 2 or 3 occurs.
This is meaningful because, thanks to (iv), (C.1), (C.2), and (3) depend
continuously on the economy.

A last remark about the word “apparently”: It is clear that,
independently of the approximations we shall make in our calculations, such a $\delta_0$
is a minimum evaluation. First, of course, because (3) gives only a
sufficient condition of stability. But also because one can enter a new
connected component of $\mathcal{V}$ without the stability of e being destroyed.
In a more mathematical language: a bifurcation point may separate
two regions where the dynamics are equivalent.

We now turn to our evaluation problem. 


2. Admissible Changes in Expectations 

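To make formulas more compact we first introduce a set of notations
(all our Jacobians D·/D· are (l − 1) × (l − 1) square matrices, using the
l − 1 first components of the vectors we consider; i = 1, ..., I;
v = t, ..., t − T).

With $d\psi_i = \psi_i' - \psi_i$ and $\hat{z}_{i2} = (0, z^2_{i2}, \ldots, z^l_{i2})$ we call:

$a = \sup_i \left\| \frac{D z_{i1}}{D p^i_{t+1}} \right\|$,   $b = \sup_i \left\| \frac{D \hat{z}_{i2}}{D p^i_t} \right\|$,

$c = \|(Z_t)^{-1}\|$,   $h = \sum_{v=t}^{t-T} \|\hat{Z}_{v-1}\|$,

$x = \sup_{i,v} \left\| \frac{D\, d\psi_i}{D p_v} \right\|$,   $s = \sum_{j=0}^{T} \beta_j$.   (4)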

We then have: 


Theorem. Let $\mathcal{E} = (z_{i1}, z_{i2}, \psi_i, m)$ be an economy in $\mathcal{V}$ and p a S.T.E.
in $W(\mathcal{E})$ such that $\alpha(\mathcal{E}, p) = hc < 1$. Then p is still a locally asymptotically
stable equilibrium for any economy $\mathcal{E}' = (z_{i1}, z_{i2}, \psi_i', m)$ such that
$\psi_i' \in B(\psi_i, \delta_0)$ with:

$\delta_0 = \frac{1 - hc}{I c (T + 1)(a + b)}.$   (5)


Proof. The proof goes in three steps:

(a) We first deal with condition (C.2). Using primes to distinguish
Jacobians associated with $\mathcal{E}'$ a short calculation gives us, from the precise
form of $Z_t$:

$\|Z_t' - Z_t\| \leq I a \beta_0 x.$

On the other hand standard algebraic identities lead to the inequality:

$\|(Z_t')^{-1}\| = \|\{Z_t[1 + (Z_t)^{-1}(Z_t' - Z_t)]\}^{-1}\| \leq c(1 - c\|Z_t' - Z_t\|)^{-1}$

which is valid if: $c\|Z_t' - Z_t\| < 1$.

In other words all $\mathcal{E}'$ with $x < (cIa\beta_0)^{-1}$ are such that $Z_t'$ is invertible.
As it is clear that, $\epsilon$ being any positive number, $\|d\psi_i\|^1 < \epsilon$ for any i
implies $x < \epsilon$, we can conclude that all $\mathcal{E}'$ with expectations in the ball
$B(\psi_i, \delta_1)$ where $\delta_1 = (cIa\beta_0)^{-1}$ lie in the same connected component of $\mathcal{V}$
as $\mathcal{E}$ (recall we forget about (C.3)).⁹

(b) Next we turn our attention to inequality (3). Using the precise
form of the Jacobians of $\hat{Z}$ we obtain (u = t − 1, ..., t − T):

$\|\hat{Z}_u' - \hat{Z}_u\| \leq I(a\beta_{t-u} + b\beta_{t-u-1})x$

$\|\hat{Z}'_{t-T-1} - \hat{Z}_{t-T-1}\| \leq I b \beta_T x.$

With the same type of calculation as in (a) we thus have the series of
inequalities:

$\alpha(\mathcal{E}', p) \leq c(1 - c\|Z_t' - Z_t\|)^{-1} \left[ \sum_{v=t}^{t-T} \|\hat{Z}_{v-1}\| + \sum_{v=t}^{t-T} \|\hat{Z}'_{v-1} - \hat{Z}_{v-1}\| \right]$

$\leq c\, \frac{h + [s(a + b) - \beta_0 a] I x}{1 - c I a \beta_0 x} = f(x)$   (6)

which are valid provided that: $x < (cIa\beta_0)^{-1}$ which, from (a), is satisfied
if $\psi_i'$ lies in $B(\psi_i, \delta_1)$.

(c) Last we study the map f. It is clear from (6) that the graph of f in
$R^2 = \{(x, y)\}$ is a hyperbola with asymptotes $x = (cIa\beta_0)^{-1}$ and
$y = -(a\beta_0)^{-1}[s(a + b) - \beta_0 a]$. On the other hand $f(0) = ch = \alpha(\mathcal{E}, p)$ so f is
a monotonically increasing function. A short calculation shows that it
takes the value one for:

$x_0 = \frac{1 - \alpha(\mathcal{E}, p)}{I c s (a + b)} < (cIa\beta_0)^{-1}.$

As clearly $s \leq T + 1$ we thus have our result. ∎

⁹ $\|\cdot\|^1$ is the uniform $C^1$ norm.

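The last step of the proof can be checked numerically. The sketch below is only an illustration: it assumes that the bound in (6) has the form f(x) = c(h + [s(a + b) − β₀a]Ix)/(1 − cIaβ₀x), which is the form consistent with f(0) = ch and with the value of x₀ given in step (c), and all parameter values are hypothetical.

```python
def f(x, a, b, c, h, I, s, beta0):
    """Assumed form of the bound (6) on alpha(E', p) as a function of x."""
    return c * (h + (s * (a + b) - beta0 * a) * I * x) / (1 - c * I * a * beta0 * x)


def x0(a, b, c, h, I, s):
    """Value of x at which f reaches one (step (c) of the proof)."""
    return (1 - c * h) / (I * c * s * (a + b))


def delta0(a, b, c, h, I, T):
    """Radius (5), obtained from x0 by using s <= T + 1."""
    return (1 - c * h) / (I * c * (T + 1) * (a + b))


# Hypothetical values chosen so that alpha = c * h = 0.6 < 1.
params = dict(a=0.5, b=0.25, c=2.0, h=0.3, I=10)
s, beta0, T = 2, 1, 2

x_star = x0(s=s, **params)
radius = delta0(T=T, **params)
print(f(0.0, s=s, beta0=beta0, **params))     # equals c * h
print(f(x_star, s=s, beta0=beta0, **params))  # close to 1
print(radius <= x_star)
```

Because s ≤ T + 1, the radius of (5) never exceeds x₀, so f stays below one everywhere inside the ball.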


Remark 3. To any $\mathcal{E}'$ corresponds an open neighborhood $U(\mathcal{E}')$ of e in
$X(\mathcal{E}')$ which is the basin of attraction of e (i.e., the set of states which are
moved towards e under the dynamics associated with $\mathcal{E}'$). Let $\rho(\mathcal{E}')$ be
some measure of the size of this basin. In general, $\inf \rho(\mathcal{E}')$ for
$\psi_i' \in B(\psi_i, \delta_0)$ will be zero because e stops being an attractor for $\psi_i'$
belonging to the boundary of B. It is however worth noting that, for
$\delta < \delta_0$, $\inf \rho(\mathcal{E}')$ for $\psi_i' \in B(\psi_i, \delta)$ has a strictly positive value because
local structural stability is an open notion.

Remark 4. If we identify $1 - \alpha(\mathcal{E}', p)$ with the rate of convergence
$r(\mathcal{E}', p)$ (see Remark 2) we see that formula (6) gives us an estimation of
how this rate varies with $\mathcal{E}'$. In particular, for x small, we get at
first order:

$|r(\mathcal{E}', p) - r(\mathcal{E}, p)| \leq I c (T + 1)(a + b) x.$


3. Sensitivity to the Characteristics of the Economy 

This section will be devoted to the analysis of the way the radius $\delta_0$ varies
under changes in the characteristics of $\mathcal{E} = (z_{i1}, z_{i2}, \psi_i, m)$ (this analysis
does make sense because, as $\mathcal{V}$ is open, $\delta_0$ is still a meaningful quantity for
economies near $\mathcal{E}$).

We shall, however, only consider those changes in characteristics which
leave the S.T.E. p invariant; otherwise second order derivatives of demands
and expectations appear to be involved and, on the one hand, we have not
assumed their existence, and on the other hand, this leads to effects which
are hardly tractable. Practically, this means that we shall leave
the values of individual demands at the S.T.E. unaffected and consider
only changes in their Jacobians and in the Jacobians of expectations
(from (A.4) and (A.6) these changes cannot, of course, be completely
arbitrary).

Our first remark then concerns the interpretation of the four quantities
a, b, c and h. Clearly a and b measure directly the sensitivity of individual
agents to the future. The interpretation of c is somewhat more subtle,
because of course, in general, $c \neq \|Z_t\|^{-1}$; in fact, a small c implies that all
eigenvalues of $Z_t$ are large, but a large c implies that $Z_t$ has a small
eigenvalue only if $\|Z_t\|$ is not large compared with the greatest eigenvalue of $Z_t$,
i.e., if the asymmetric part of $Z_t$ is not large compared with its symmetric part.
However, this seems to be a reasonable economic assumption (it means the
excess-demand for one good is not much more sensitive to the price of
another good than to its own price), so we shall admit that $c^{-1}$ measures
the minimum over all price directions in P of the sensitivity of aggregated
excess demand to changes in the present prices. Lastly, there is no problem
with h which clearly measures the total sensitivity of aggregated
excess-demand to past prices. We do not claim, of course, that a, b, c, h are
independent quantities, but we shall see in our discussion that there is not
too much trouble with this point.

Remark 5. Note that from these interpretations the inequality (3) for
local asymptotic stability of some e (or, equivalently, for positiveness of
the rate of convergence $r(\mathcal{E}, p)$) in fact means that the sensitivity of
aggregate excess demand to changes in past prices is smaller than its
sensitivity to changes in any present price.

We shall now study separately five dependences.

(1) Our interest will first focus on the role of the number I of agents
and on the significance of the factor $(I)^{-1}$ in (5) (to leave p invariant
we shall change m proportionally to I so that the individual stock of money
of any consumer at the S.T.E. remains constant). A first conclusion is:

Result 1. If all agents are alike $\delta_0$ does not depend any more on I;
similarly, for “replica economies,” there is no “number effect” on $\delta_0$.

Proof. Obvious from the fact that the $\hat{Z}_u$ are symmetric sums of
individual terms. ∎

Result 1 is just a partial consequence of the fact that price dynamics are
left unchanged under any replica operation.

Another interesting situation is obtained by considering two types of 
agents, 1 and 2 , with similar demands but expectations such that: 


for at least one u (u = t, ..., t − T; T > 1). Let $\mathcal{E}_i$ (i = 1, 2) be the 
economy with all its I agents of type i and suppose that $h(\mathcal{E}_1) > h(\mathcal{E}_2)$. 
Then $\delta_0(\mathcal{E}_1) < \delta_0(\mathcal{E}_2)$ (because the quantities a, b, c are the same for 
$\mathcal{E}_1$ and $\mathcal{E}_2$). We shall say that agents of type 2 are “normal” and that agents 
of type 1 are “deviant.” Consider then economies with $I_1$ deviants and $I_2$ normals 
($I_1 + I_2 = I$). Then: 

Result 2. When the proportion of deviants decreases in the economy 
then $\delta_0$ increases. 

Proof. The radius $\delta_0(\mathcal{E})$ is proportional to $(I)^{-1}(c^{-1} - h)$. The term $c^{-1}$ 
does not depend on the way I is cut in $I_1 + I_2$. But in Formula (6) and thus 
(5) we can replace h by $(I)^{-1}[I_1 h(\mathcal{E}_1) + I_2 h(\mathcal{E}_2)]$. Thus $\delta_0$ is proportional to: 

$\bar c^{-1} - \bar h(\mathcal{E}_2) - (I)^{-1} I_1 [\bar h(\mathcal{E}_1) - \bar h(\mathcal{E}_2)]$, 

which increases when $(I)^{-1} I_1$ goes to zero ($h = I \bar h$, $c^{-1} = I \bar c^{-1}$). | 

Result 2 validates the intuition that deviant agents make the local 
dynamics near p less stable. 

(2) Next we shall consider changes in a or b. One has to be very 
careful here because the values of c and h are then necessarily altered 
directly and may also be altered on account of the relations induced by 
(A.4). We have however two results for “small” and “large” values of 
a and b. 


Result 3. If aggregate excess demand is sufficiently sensitive to changes 
in any present price while individual excess demands are but little sensitive 
to the future, then the smaller this sensitivity the larger $\delta_0$. 


Proof. Suppose a and b are small enough so that, on one hand, their 
product by any 


is small before 


JW< 

Dp 


7®1 


and 


Dp,-i 


(*) 



and, on the other hand, the product of a by any $\|(D\psi_i/Dp_t)(e)\|$ is small 
compared with the smaller eigenvalue of $Z_t$. Then multiplying a and b by $\lambda < 1$ 
leaves h and c roughly unchanged, so $\delta_0$ is itself roughly multiplied by 
$\lambda^{-1}$. | 

Result 4. If demand for one non-storable good of at least one 
individual agent becomes very sensitive to the future, then $\delta_0$ becomes very 
small or zero. 


Proof. In this opposite situation it is first clear that multiplying a and b 
by $\lambda > 1$ roughly changes h to $\lambda h$. But the smaller eigenvalue of $Z_t$ will 
stay at a finite value even for large $\lambda$, so c will have some limit anyhow, 
and $\delta_0$ will go progressively to zero for growing $\lambda$. | 

Results 3 and 4 both give some precise support to the intuition that the 
more the agents are sensitive to the future the more the dynamics are 
sensitive to changes in expectations. 

(3) We now consider the way $\delta_0$ depends on c. Our first remark is 
that our situation is simpler than the previous one because we can vary c 
without a, b and h (and, of course, always $\bar p$!) being changed; for instance, 
one can vary the derivative of some $z_i^k$ with respect to some $p_t^{k'}$ with k and 
k' > 2. Thus: 

Result 5. If, by changing sensitivity of individual young agents to the 
present price of goods, aggregate excess demand is made less sensitive 
to the present, then $\delta_0$ becomes smaller or zero. 

Proof. From (5), $\delta_0$ then behaves as $c^{-1}$ minus a constant. | 

Result 5 is some sort of complement of Result 4: Diminishing sensitivity 
to the present together with leaving unchanged sensitivity to the future is 
another way of giving more importance to the future. 

(4) We now turn our attention to the role of the length T of the 
memory of the agents. T appears through the upper bound we used to 
obtain (5) from (7), and also, implicitly, through the definition of h. Let us 
now consider the subspace of E where the expectations of all agents 
are of the form $\psi_i = \varphi_i + \delta\psi_i$ with $\varphi_i$ fixed and such that: 


■Py. 

Dp» 



and dipt satisfies: 

| (*)|| < Poy‘<T u (P. & > 0, 0 < y, y 0 < 1, k = t . t - T). 

We shall call $E(\varphi_i)$ a space of economies where all agents have “normal 
memories.” By contrast, we shall say that an agent has an “abnormal 
memory” if 

Vu(P>0) 

(clearly those two situations do not cover all possible types of agents!). 
We then have from formula (7): 

Result 6. If we consider a space of economies where all agents have 
normal memories of a given length and if we lengthen these memories 
without changing the way expectations depend on the present, then $\delta_0$ is 
but little altered. 


Proof. By assumption, we change expectations around $\varphi_i$ in such a way 
that 

$\frac{D\,\delta\psi_i}{Dp_t}(e) = 0$, 

so that a, b and c are unchanged; then, by definition of a normal memory 
and from (4), s and h go to some limit when T goes to infinity 
($\lim s = (1 - \gamma_0)^{-1}$); lastly, it is clear that the longer the memory, the 
nearer we are already to the limit. | 

Result 7. It is sufficient that one agent has an abnormal memory for $\delta_0$ 
to become zero as the length of the memory increases. 

Proof. From (5), $\delta_0$ is the sum of a positive term which decreases as 
$(T + 1)^{-1}$ and of a negative term which remains roughly constant as, 
by definition of abnormality, h is itself roughly proportional to T. | 

Result 6 means that the length of the memory has not much influence on 
our local dynamics if the weight of the past is exponentially decreasing. 
By contrast, Result 7 tells us that if this weight remains constant then 
the longer the memory the less stable are the dynamics (too much attention 
is paid to remote periods). 

(5) Lastly, we study the way $\delta_0$ directly depends on the shape of the 
expectations, the length of the memory being kept fixed. As above we 
shall consider changes in expectations which do not affect their dependence 
on the present so that a, b and c are unaffected (from (A.6) this is possible 
only if T > 1). Then: 

Result 8. The dependence of the expectations on the present price 
being kept fixed, the less the expectations of agents are sensitive to past 
prices, the larger $\delta_0$. 

Proof. As only h is affected by our changes, this result is obvious from 
Formula (5). | 

In terms of deviant agents (see (1)) or of agents with abnormal memory 
(see (4)) one can also state Result 8 in the form: 

Result 9. When the proportion of deviants or of agents with abnormal 
memories increases in the economy, or when some agents become more 
deviant, or when the memory of some agents switches from normality to 
abnormality, then $\delta_0$ becomes very small or zero. 

Results 8 and 9 are not supported by an obvious intuition; of course, 
one should not pay too much attention to a far away past, but focusing 
only on the most recent data could have been thought of as having a rather 
destabilizing effect! 

A last question has to be raised at the end of this section: since 
$\delta_0$ is only an estimate of the maximum size $\delta$ of the ball inside which 
expectations can be moved freely, are the previous results also valid for $\delta$, 
which is really what is of interest for us? Our answer can only be: the better 
our approximations, the better the correlation between the variations 
of $\delta_0$ and $\delta$, and one can actually build examples where $\delta_0$ and $\delta$ coincide. 
This, in fact, will be one of the situations we shall consider in Section 4. 

4. The Case of a Single Consumption Good 

The results of the two previous sections are particularly simple to 
visualize in the case where there is only one consumption good in the 
economy in addition to money: All the Jacobians we considered are then 
just numbers and they can be handled easily. 

I. Calculation of 8 0 

Let us first simplify our notations: $\partial_j z_i$ will denote the derivative of $z_i^1$ 
with respect to $p_t^1$ (j = 1) or $p_{t+1}^{*1}$ (j = 2) taken at point ($\bar p$, $\bar p$), and $\partial_j \psi_i$ 
the derivative of $\psi_i^1$ with respect to $p_t^1$ (j = 0),..., $p_{t-T}^1$ (j = T) taken at 
point e. Then we have: 

$Z_t = \sum_i (\partial_1 z_i + \partial_2 z_i \, \partial_0 \psi_i)$ 

$Z_u = \sum_i \partial_2 z_i \, \partial_{t-u} \psi_i$   (u = t − 1,..., t − T). 

On the other hand from Assumptions (A.4) and (A.6): 

$\sum_i (\partial_1 z_i + \partial_2 z_i) = -(\bar p^1)^{-1} m$   (8) 

$\partial_0 \psi_i + \cdots + \partial_T \psi_i = 1 \quad \forall i$   (9) 

We shall suppose from now on that traders are identical except for their 
expectations. Then: 

$a = |\partial_2 z|, \quad b = 0$ 

$c^{-1} = |I \partial_1 z + \partial_2 z \sum_i \partial_0 \psi_i|$ 

$h = a \sum_i (|\partial_1 \psi_i| + \cdots + |\partial_T \psi_i|)$. 

Let us add the two supplementary reasonable assumptions that $\partial_2 z > 0$ 
(individual demand of money increases with its expected price) and that 
$Z_t < 0$ (total demand of money decreases when its present price increases). 
A short calculation using (5), (8) and (9) then gives: 

$\delta_0 = [I(T + 1)]^{-1} \Big[ (\bar p^1)^{-1} m (\partial_2 z)^{-1} + \sum_i \sum_{j=1}^{T} (\partial_j \psi_i - |\partial_j \psi_i|) \Big]$.   (10) 





Remark 6. In the case of a single consumption good, if in addition 
T = 1, one easily becomes convinced that the condition of Formula (3) is not 
only sufficient but also necessary for local asymptotic stability of $\bar p$. A short 
calculation then shows that with the assumptions above, $\partial_2 z > 0$ and 
$Z_t < 0$, and, in addition for instance, $\partial_1 \psi_i < 0 \ \forall i$, then $\delta_0$ actually is 
the maximal radius of the ball inside which expectations can be moved 
freely. 

We shall then proceed even one step further and calculate $\delta_0$ in case the 
individual demands derive from a logarithmic utility function. With $q_1$, $q_2$ 
the consumptions of the real good of the young and old consumer 
respectively we thus suppose utility u is given by: 

$u(q_1, q_2) = \log q_1 q_2$. 

Standard calculations of maximization of u for a present price of money 
equal to $p_t$ and an expected price equal to $p^*_{t+1}$ then lead to a demand of 
money at time t equal to (see footnote 10): 

$z(p_t, p^*_{t+1}) = \frac{\omega_1}{2 p_t} - \frac{\omega_2}{2 p^*_{t+1}}$, 

where $\omega_1$, $\omega_2$ are the endowments (in real good) received by each consumer 
when he is young and old respectively (excess demands for the real good 
can be easily calculated from Walras’ law (A.2)). 

The dynamics of the problem are then given by: 

$\sum_i \tfrac{1}{2} [(\omega_1 / p_t) - (\omega_2 / p^*_{t+1,i})] - m = 0$   (11) 

Supposing that $\omega_1 > \omega_2$ (which is assumption (A.12) in [2]) we have the 
S.T.E.: 

$\bar p = \frac{I(\omega_1 - \omega_2)}{2m}$. 

Choosing then initial expectations of the form 

$p^*_{t+1,i} = \psi_i(p_t, p_{t-1}) = p_t - \beta_i (p_t - p_{t-1})$   (12) 

we get from (10): 

$\delta_0 = \frac{\omega_1 - \omega_2}{2 \omega_2} + \frac{1}{2I} \sum_i (\beta_i - |\beta_i|)$.   (13) 
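
The logarithmic example can be checked numerically. The following is a minimal sketch, assuming the budget constraints $q_1 = \omega_1 - p_t z$ and $q_2 = \omega_2 + p^*_{t+1} z$ (an assumption consistent with the demand function quoted above, not a statement taken from the text); the function names and the numerical values are purely illustrative.

```python
import math


def money_demand(p_now, p_expected, w1, w2):
    """Demand for money of a young agent with utility log(q1 * q2)."""
    return w1 / (2 * p_now) - w2 / (2 * p_expected)


def utility(z, p_now, p_expected, w1, w2):
    """Utility reached when holding z units of money (assumed budget constraints)."""
    q1 = w1 - p_now * z        # consumption when young
    q2 = w2 + p_expected * z   # consumption when old
    return math.log(q1 * q2)


def stationary_price(num_agents, money, w1, w2):
    """Stationary price of money when every agent expects the current price."""
    return num_agents * (w1 - w2) / (2 * money)


# Illustrative values only: 10 agents, money stock 5, endowments 4 and 1.
p_bar = stationary_price(10, 5.0, 4.0, 1.0)
z_bar = money_demand(p_bar, p_bar, 4.0, 1.0)
```

With these values the stationary price is 3 and each agent holds 0.5 units of money, so the ten agents together hold the whole stock m = 5, as the market-clearing condition (11) requires.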


10 This expression for z is, of course, only valid in the region of P* where it is positive. 





II. Sensitivity of $\delta_0$ 

We now compare Formulas (10) and (13) to the eight results given in the 
previous section. 

(1) One checks easily that $\delta_0$ does not depend on I for identical 
agents (Result 1). On the other hand let us consider (10). Suppose T = 2 
and we have two types of agents defined through: 

type 1: $\partial_0 \psi_i = 1$, $\partial_1 \psi_i = -\partial_2 \psi_i = \tfrac{1}{2}$ 

type 2: $\partial_0 \psi_i = 1$, $\partial_1 \psi_i = -\partial_2 \psi_i = 1$. 

Then the smaller the number of agents of type 2 (the “deviants” according 
to our terminology of Section 3) the larger is $\delta_0$: precisely, with $I_2$ agents 
of type 2 and $I - I_2$ of type 1 we have: 

$\delta_0 = [I(T + 1)]^{-1} [(\bar p^1)^{-1} m (\partial_2 z)^{-1} - I - I_2]$ 

(Result 2). 
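
The two-type comparison can be reproduced with a short computation. The sketch below assumes that Formula (10) reads $\delta_0 = [I(T+1)]^{-1}[(\bar p^1)^{-1} m (\partial_2 z)^{-1} + \sum_i \sum_{j \geq 1} (\partial_j \psi_i - |\partial_j \psi_i|)]$ and that type 1 agents have $\partial_1 \psi_i = -\partial_2 \psi_i = 1/2$; both are reconstructions from the surrounding formulas, and all names and numbers are hypothetical.

```python
def delta0(num_agents, memory_len, p_bar, money, d2z, dpsi):
    """Evaluate the assumed form of Formula (10).

    dpsi[i][j - 1] is the derivative of agent i's expectation with respect
    to the price observed j periods ago (j = 1, ..., memory_len).
    """
    if len(dpsi) != num_agents:
        raise ValueError("one row of derivatives per agent is required")
    expectation_term = 0.0
    for row in dpsi:
        if len(row) != memory_len:
            raise ValueError("each row needs memory_len derivatives")
        # Positive derivatives cancel out; negative ones count twice.
        expectation_term += sum(d - abs(d) for d in row)
    present_term = money / (p_bar * d2z)
    return (present_term + expectation_term) / (num_agents * (memory_len + 1))


def two_type_economy(num_agents, num_deviants):
    """Rows of derivatives for I - I_2 type 1 agents and I_2 type 2 agents (T = 2)."""
    type_1 = [0.5, -0.5]
    type_2 = [1.0, -1.0]
    return [type_2] * num_deviants + [type_1] * (num_agents - num_deviants)


# Illustrative values only: I = 10, T = 2, p_bar = 1, m = 40, d2z = 1.
radii = [delta0(10, 2, 1.0, 40.0, 1.0, two_type_economy(10, k)) for k in range(11)]
```

Each additional agent of type 2 lowers the bracket by one, so `radii` is strictly decreasing in the number of deviants, in line with Result 2.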

(2) If $\partial_2 z$, the sensitivity of the agents to the future, goes to zero, 
$\delta_0$ goes to infinity (Result 3). Conversely for $\partial_2 z$ large enough $\delta_0$ equals 
zero (clearly the term in (10) which depends on expectations is negative) 
(Result 4). 

(3) In the case where there is only one consumption good, the effects 
described by Result 5 and Result 4 coincide. 

(4) If the length T of the memory of the agents is increased then: 

in case of “normal” memories as defined in 4, Section 3, then 
clearly the quantity x r , of Formula (7) goes to a nonzero limit if the 
exponential decrease of the derivatives of the i p t is fast enough (Result 6); 

in case of abnormal memories with for instance $|\partial_j \psi_i| > 1$, then $\delta_0$ 
becomes zero for a finite value of T (because surely an increasing number 
of $\partial_j \psi_i$ are negative) (Result 7). 

(5) The smaller the dependence of the expectations on the past, i.e., 
the smaller the derivatives $\partial_1 \psi_i$,..., $\partial_T \psi_i$, clearly, the larger is $\delta_0$ (Result 8). 
If the memory of even only one agent becomes very “abnormal,” in the 
sense that $|\partial_j \psi_i|$ becomes large, then clearly $\delta_0$ goes to zero (Result 9). 

The example from formula (12) shows, for instance, that if all $\beta_i$ are 
positive except $\beta_1$, then $\delta_0$ is positive so long as: 

$|\beta_1| < \frac{I(\omega_1 - \omega_2)}{2 \omega_2}$. 





Of course the critical abnormality grows proportionally to the number of 
agents. 
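
The critical abnormality can be illustrated numerically. This is a sketch under stated assumptions: inserting the expectations (12) into (10) with T = 1, $\partial_1 \psi_i = \beta_i$ and $\partial_2 z = \omega_2 / (2 \bar p^2)$ gives $\delta_0 = (\omega_1 - \omega_2)/(2\omega_2) + (2I)^{-1} \sum_i (\beta_i - |\beta_i|)$. That closed form is a reconstruction, and the function names and values are hypothetical.

```python
def delta0_log_example(betas, w1, w2):
    """Assumed closed form of delta_0 for logarithmic utility and T = 1.

    The returned value may be negative; a negative value means that the
    estimate provides no ball of positive radius.
    """
    num_agents = len(betas)
    correction = sum(b - abs(b) for b in betas) / (2 * num_agents)
    return (w1 - w2) / (2 * w2) + correction


def critical_abnormality(num_agents, w1, w2):
    """Largest |beta_1| of a single abnormal agent that keeps delta_0 positive."""
    return num_agents * (w1 - w2) / (2 * w2)


# Illustrative values only: 10 agents, endowments 4 and 1, one negative beta.
below = delta0_log_example([-14.0] + [0.5] * 9, 4.0, 1.0)
above = delta0_log_example([-16.0] + [0.5] * 9, 4.0, 1.0)
```

Here the threshold is 15: the estimate is positive for $\beta_1 = -14$ and negative for $\beta_1 = -16$, and doubling the number of agents doubles the threshold.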


Conclusion 

Two types of results are presented in this paper. 

First we give an explicit formula for the calculation of the estimate $\delta_0$. 
Of course, it is clear that some of the approximations we use may appear 
to be rather crude in some definite situations. Our opinion, however, is 
that $\delta_0$ correctly describes the order of magnitude of $\delta$ and that it is 
doubtful that calculations can be significantly improved without 
introducing a more specific scheme. On the other hand, the precise numerical 
value that $\delta_0$ can take in a given situation cannot be given a deep economic 
meaning. But it surely opens new doors for further models, where, for 
instance, price expectations would not be pointwise distributions or would 
possibly be submitted to accidental perturbations, etc.: then the precise 
value of $\delta_0$ gives us an indication of how large the uncertainty can be 
without the long run stability being destroyed. 

Then we give further information about the way $\delta_0$ is sensitive to the 
parameters of the economy. If a result like Result 3 cannot be considered as 
unexpected, some others, and in particular Results 8 and 9, are not 
supported by obvious intuitions. Moreover, the influence of the existence 
of even a single “deviant” or “abnormal” agent comes clearly to light. 
Thus we have criteria for more or less sensitivity to some uncertainty 
which might prove to be economically relevant. 

In addition to introducing definite formulations of uncertainty, another 
possible line for further work with this model is to study “control” 
problems. An example of such a problem could be the explicit insertion of 
time in price expectations: for instance, the expectations could vary in 
response to external political decisions or some external economic policy, 
etc. The results here then will limit the type of admissible controls that 
can be used to achieve such a goal as stability in the long run. 


References 

1. P. A. Samuelson, An exact consumption loan model of interest with or without the 
social contrivance of money, J. Polit. Econ. 66 (1958), 467. 

2. G. Fuchs and G. Laroque, Dynamics of temporary equilibria and expectations, 
Econometrica, in press. 

3. G. Fuchs, Structural stability for dynamical economic models, J. Mathematical 
Econ. 2 (1975), 139. 



JOURNAL OF ECONOMIC THEORY 13, 217-228 (1976) 


Manipulation of Social Choice Functions 

Peter Gärdenfors 

Department of Philosophy, University of Lund, Lund, Sweden and 
Department of Philosophy of Science, Umeå University, Umeå, Sweden 

Received October 22, 1975 


1. Introduction 

It is well known that for most group decision functions used in practice, 
it is possible to manipulate the outcome of the function in the sense that 
an individual (or a group of individuals), by misrepresenting his prefer¬ 
ences, secures an outcome he prefers to the outcome of the function which 
would have obtained if he had expressed his sincere preferences. However, 
it is only recently that the possibilities of finding group decision functions 
which are not manipulable have been investigated. As expected, it is found 
that most functions are manipulable and those which are not must be 
rejected for other reasons. 

As an introductory example, take the method where each individual 
has one vote and where the alternative which gets the greatest number of 
votes wins. Suppose we have a voting situation where there are two 
alternatives x and y, each of which is considered best by 45 % of the voters, 
and an alternative z, which is considered best by the remaining 10%. 
Now it may be possible for those who prefer z to manipulate the outcome 
of the voting by misrepresenting their preferences. It is clear that one of 
the alternatives x and y will win, so, by voting for the best of these instead 
of voting for z, it is possible to manipulate the outcome in a favorable 
direction. 
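
The introductory example can be made concrete with a small tally. The sketch below assumes 100 voters and assumes that the supporters of z prefer x to y; neither assumption is stated in the text, and the function name is illustrative.

```python
from collections import Counter


def plurality_winners(ballots):
    """Return the set of alternatives receiving the greatest number of votes."""
    tally = Counter(ballots)
    best = max(tally.values())
    return {alternative for alternative, votes in tally.items() if votes == best}


# Sincere voting: 45 voters for x, 45 for y, 10 for z.
sincere = ["x"] * 45 + ["y"] * 45 + ["z"] * 10

# The 10 supporters of z vote for x, the better of the two leaders in their eyes.
strategic = ["x"] * 45 + ["y"] * 45 + ["x"] * 10
```

Under sincere voting x and y tie, while the misrepresented ballots make x the unique winner, which the supporters of z prefer under the stated assumption.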

This example shows that manipulability is not necessarily something 
to be avoided. Those who would vote for z, if voting sincerely, conclude 
that such a vote would have no effect on the outcome and the only way 
for them to influence the outcome is to misrepresent their preferences. 
However, a group decision method which is not manipulable is appealing, 
since it makes needless all strategic considerations, and thus makes voting 
simple from a game-theoretical point of view. 

Recent research on the manipulability of voting methods has mainly 
been devoted to social choice functions, which, in all decision situations, 



select a single alternative as the winning alternative. We call such functions 
resolute social choice functions. It has been shown independently by 
Gibbard [5] and Satterthwaite [8] that all such functions with at least 
three possible outcomes are either dictatorial or subject to individual 
manipulation. This result is, however, dependent on the assumption that 
the outcome of a resolute social choice function consists of a single 
alternative. 

In this paper we will mainly discuss a more general class of social choice 
functions which do not necessarily select only one alternative in all 
situations, but merely a (nonempty) subset of the set of alternatives. 
In connection with manipulability, this kind of voting method has been 
studied by Pattanaik (see, e.g., [6,7]). 

For social choice functions in general, in contrast to resolute functions, 
it is not a trivial problem to define manipulability, since one has to 
compare outcomes consisting of several alternatives, when the only 
information available is individual preference orderings of single 
alternatives. We shall formulate some conditions which are sufficient for 
manipulability, but which are weak in comparison to some other attempts. 
Using this concept of manipulability, we then show that most democratic 
social choice functions, among them all majority functions, are manip- 
ulable. This result is, however, dependent on the fact that individual 
preference orderings are allowed to contain ties. Some examples show 
that if individual preferences are restricted to linear preference orderings, 
then there are non-trivial functions which are not manipulable. These 
functions are unfortunately very indecisive in most situations. 


2. Notation 

In this section, we introduce the notation and the formal frame for the 
group decision functions. 

The set of alternatives is denoted A and is assumed to contain m 
elements, m ≥ 3. Single alternatives are denoted x, y, z,..., and nonempty 
subsets of A are denoted X, Y, Z,.... The set of voters is denoted V and is 
assumed to contain n elements, n > 1. Single voters are denoted 1, 2,..., n, 
and as variables we use i, j, and k. 

A binary relation R is called a weak ordering if it is transitive and 
connected. Weak orderings will be used to represent preference orderings 
of the alternatives. If i is a voter, $R_i$ will denote his preference ordering. 
For every preference ordering R, we define the strict preference relation P 
and the indifference relation I in the usual way, i.e., xPy iff xRy and not 
yRx; and xIy iff xRy and yRx. A weak ordering is called linear iff for all 
alternatives x and y, if x ≠ y, then either xPy or yPx. $\mathbf{R}$ denotes the set of 
all weak orderings of A. Similarly, $\mathbf{P}$ denotes the set of all linear orderings 
of A. A situation is an element in $\mathbf{R}^n$, and will be denoted ($R_1$,..., $R_i$,..., $R_n$). 
We will use a, b, c,... to denote particular situations. $R_{ia}$ will denote 
voter i’s preference ordering in the situation a. Situations will be described 
in the following self-explanatory manner: 

a: 1. yxz 

2. x(yz) 

3. (xyz) 

Preference is indicated by position where the ordering is from left to right, 
except for elements enclosed in parentheses, which are ties. 
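
The notation for situations translates directly into a data structure. As a minimal sketch, the hypothetical helper below turns a string such as x(yz) into a dictionary mapping each alternative to its rank, where a lower rank is better and tied alternatives share a rank; single-letter alternatives are assumed.

```python
def parse_ordering(text):
    """Convert e.g. 'x(yz)' into {'x': 0, 'y': 1, 'z': 1}."""
    rank = {}
    level = 0
    inside_tie = False
    for char in text:
        if char == "(":
            inside_tie = True
        elif char == ")":
            inside_tie = False
            level += 1          # the whole tie occupied one level
        elif char.isalpha():
            rank[char] = level
            if not inside_tie:
                level += 1      # a lone alternative occupies its own level
    return rank


def weakly_prefers(rank, x, y):
    """xRy: x is at least as good as y."""
    return rank[x] <= rank[y]


def strictly_prefers(rank, x, y):
    """xPy: xRy and not yRx."""
    return rank[x] < rank[y]


# A situation with three voters, one ordering per voter.
situation = [parse_ordering(s) for s in ("yxz", "x(yz)", "(xyz)")]
```

In this representation the second voter strictly prefers x to y and is indifferent between y and z, while the third voter is indifferent among all three alternatives.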

A social choice function (SCF) is a function F: $\mathbf{R}^n \to 2^A - \{\emptyset\}$, where 
$2^A$ denotes the set of all subsets of A. A resolute social choice function 
(RSCF) is a function F: $\mathbf{R}^n \to A$. Hence, a social choice function selects 
a nonempty subset of A in each situation, while a resolute social choice 
function selects only one alternative. If we identify the one element subsets 
of A with the elements, we see that every RSCF is a SCF, but the converse 
is not true. 


3. Manipulation of Resolute Social Choice Functions 

For resolute social choice functions the following definition of 
manipulability is the most natural: 

Definition 1. A resolute social choice function F is manipulable by i at 
($R_1$,..., $R_i$,..., $R_n$) iff there is an ordering $R_i'$ such that 
F($R_1$,..., $R_i'$,..., $R_n$) $P_i$ F($R_1$,..., $R_i$,..., $R_n$). We say that F is 
non-manipulable or stable iff F is nowhere manipulable. 

We will now state the Gibbard-Satterthwaite theorem for resolute 
social choice functions. 

Definition 2. A resolute social choice function F is dictatorial iff 
there is a voter i such that, for every situation a and every alternative y 
in the range of F, F(a) $R_i$ y. The voter i is called a dictator for F. 

Theorem 1 (Gibbard-Satterthwaite). If F is a resolute social choice 
function which is stable, and if the range of F contains at least three elements, 
then F is dictatorial. 

Theorem 1 may give the impression that any search for a reasonable 
stable decision function is a hopeless enterprise. The proof of the theorem 
is, however, crucially dependent on the assumption that the function is a 
resolute social choice function. For such functions, ties between two or 
several alternatives are never allowed as the outcome of the decision 
function. This is a rather restrictive and unnatural assumption, since, for 
many democratic group decision methods, there are situations where the 
outcome is a tie which is then broken by some chance procedure to obtain 
the winning alternative. We therefore turn our attention to the entire class 
of social choice functions. 


4. A Generalized Definition of Manipulability 

In order to be able to define manipulability, one needs a criterion on 
how the voters value the outcomes of the choice function. Since the 
outcomes of resolute social choice functions are single alternatives, 
one obtains, for each voter, a valuation of these outcomes directly from 
his individual preference ordering. If we consider social choice functions in 
general, matters become more complicated, since manipulability has to be 
defined from the voter’s valuations of different subsets of A, and the only 
information available is the preference orderings which are orderings of 
the elements of A. 

Given an individual preference ordering of A, we shall now define 
a partial ordering >* of the nonempty subsets of A. The main idea when 
judging a subset as definitely better than another subset is a sure-thing 
principle; if some alternative has been added, it should be at least as good 
as all the other alternatives, and if some alternative has been deleted, it 
should be worse than the remaining alternatives. 

Definition 3. Let R be a weak ordering of A. Let X and Y be nonempty 
subsets of A. Then X > Y iff one of the following conditions is 
satisfied: 

(i) X ⊂ Y, and for all x ∈ X and y ∈ Y − X, xRy, and there exist 
x ∈ X and y ∈ Y − X such that xPy. 

(ii) Y ⊂ X, and for all x ∈ X − Y and y ∈ Y, xRy, and there exist 
x ∈ X − Y and y ∈ Y such that xPy. 

(iii) Neither X ⊂ Y nor Y ⊂ X nor X = Y, and for all x ∈ X − Y 
and y ∈ Y − X, xRy, and there exist x ∈ X − Y and y ∈ Y − X such that 
xPy. 

Example 1. Let A consist of x, y and z, and let R be the preference 
ordering determined by xPy and yPz. It is then easy to check that 
{x, y} > {y}, {x} > {x, y}, {x, y} > {x, z}, and {x, y} > {y, z}. It is, 
however, not true that {x, z} > {y}. 

Example 2. Let A be as above, and let R be the preference ordering 
determined by xPy, xPz and yIz. Then {x} > {x, y}, {x, y} > {y, z}, and 
{x, y} > {x, y, z}, but not {y} > {y, z}. 
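
Definition 3 and the two examples can be checked mechanically. The sketch below is one possible encoding, not the author's: a weak ordering is a dictionary from alternatives to ranks (lower is better, ties share a rank), and the three cases of the definition are reduced to a comparison between a "left" and a "right" set of alternatives. The function name is hypothetical.

```python
def better(rank, X, Y):
    """Return True iff X > Y in the sense of Definition 3 for the ordering `rank`."""
    X, Y = frozenset(X), frozenset(Y)
    if not X or not Y or X == Y:
        return False
    if X < Y:                       # case (i): alternatives were deleted from Y
        left, right = X, Y - X
    elif Y < X:                     # case (ii): alternatives were added to Y
        left, right = X - Y, Y
    else:                           # case (iii): neither set contains the other
        left, right = X - Y, Y - X
    pairs = [(x, y) for x in left for y in right]
    all_weak = all(rank[x] <= rank[y] for x, y in pairs)
    some_strict = any(rank[x] < rank[y] for x, y in pairs)
    return all_weak and some_strict


example_1 = {"x": 0, "y": 1, "z": 2}   # xPy and yPz
example_2 = {"x": 0, "y": 1, "z": 1}   # xPy, xPz and yIz
```

For instance, `better(example_2, {"x", "y"}, {"x", "y", "z"})` holds because the added alternative z is no better than x or y and strictly worse than x.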

The fact that {x, y} > {x, y, z} in the example above may appear 
somewhat unmotivated. However, if the final choice from an outcome 
consisting of several alternatives is made by a random mechanism which 
assigns equal probability to all alternatives in the choice set, then {x, y} 
will be a better outcome than {x, y, z}, if one’s preferences are as in the 
example above, since the best alternative x has a greater chance of winning 
in the first outcome than in the second. 

We need not confine ourselves to this interpretation of the outcomes 
when we show that most social choice functions are manipulable. The 
cases we need in the proofs will all be obvious examples of manipulation. 

It is possible to show that, for every weak ordering R, the corresponding 
relation > is a partial ordering, i.e., transitive and irreflexive. The proof 
presents no difficulties but is rather tedious, so we omit it. 

Definition 4. A social choice function F is manipulable by i at 
($R_1$,..., $R_i$,..., $R_n$) iff there is an ordering $R_i'$ such that 
F($R_1$,..., $R_i'$,..., $R_n$) $>_i$ F($R_1$,..., $R_i$,..., $R_n$), where $>_i$ is the 
ordering derived from $R_i$. F is non-manipulable or stable iff F is nowhere 
manipulable. 

This definition obviously reduces to Definition 1 when restricted to 
resolute social choice functions. 

We next give two simple examples of stable social choice functions. 

Example 3. Let $F_1$ be the social choice function which is defined by 
$F_1$(a) = A, for all situations a. It is an immediate consequence of the 
definitions that $F_1$ is stable. 

Example 4 (Gardner [4]). Let $F_2$ be the social choice function which 
is defined in the following way: x ∈ $F_2$($R_1$,..., $R_n$) iff there is some $R_i$ such 
that for all y, x$R_i$y. $F_2$(a) thus consists of all alternatives which are 
top-ranked in some voter's preference ordering in a. It is easy to verify that $F_2$ 
is stable. 

This example shows that Theorem 1 cannot be extended to social choice 
functions in general, since the range of $F_2$ is A and $F_2$ is non-dictatorial. 
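
The stability of the function of Example 4 can be confirmed by exhaustive search in a small case. The sketch below assumes three alternatives and two voters, encodes weak orderings as rank dictionaries (lower is better), and uses a hypothetical helper `better` for the subset ordering of Definition 3; it is a brute-force check, not a proof.

```python
from itertools import product

ALTERNATIVES = ("x", "y", "z")


def better(rank, X, Y):
    """X > Y in the sense of Definition 3 for the weak ordering `rank`."""
    if X == Y:
        return False
    if X < Y:
        left, right = X, Y - X
    elif Y < X:
        left, right = X - Y, Y
    else:
        left, right = X - Y, Y - X
    pairs = [(a, b) for a in left for b in right]
    return (all(rank[a] <= rank[b] for a, b in pairs)
            and any(rank[a] < rank[b] for a, b in pairs))


def top_choice_function(situation):
    """All alternatives that are top-ranked by at least one voter."""
    chosen = set()
    for rank in situation:
        best = min(rank.values())
        chosen |= {alt for alt, r in rank.items() if r == best}
    return frozenset(chosen)


def all_rankings():
    """Every rank assignment; each weak ordering appears at least once."""
    return [dict(zip(ALTERNATIVES, ranks)) for ranks in product(range(3), repeat=3)]


def find_manipulation(choice_function, num_voters):
    """Return (situation, voter, misreport) if some voter can manipulate, else None."""
    rankings = all_rankings()
    for situation in product(rankings, repeat=num_voters):
        sincere = choice_function(situation)
        for voter in range(num_voters):
            for lie in rankings:
                changed = list(situation)
                changed[voter] = lie
                if better(situation[voter], choice_function(changed), sincere):
                    return situation, voter, lie
    return None
```

Calling `find_manipulation(top_choice_function, 2)` searches every situation and every misreport and finds no profitable deviation, in agreement with the claim of Example 4.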

In Definition 4 we have chosen a weak concept of manipulability such 
that there is no doubt that i prefers the outcome X to the outcome Y, 





if X $>_i$ Y. It will therefore be a strong result if one is able to show that all 
functions in a certain class of social choice functions are manipulable. On 
the other hand, showing that a particular function is stable in the sense 
of Definition 4 is a comparatively weak result, since, firstly, it is possible 
that a stronger concept of manipulability is more natural and, secondly, 
showing that a certain function is not manipulable by a single individual 
does not in general imply that the function is not manipulable by a group 
of individuals. We will return to this topic in connection with Theorem 4. 

In [6] and [7], Pattanaik uses a maximin relation to define an ordering 
of the subsets of A. He obtains a connected ordering of the subsets, and 
thus every two subsets of A are comparable with respect to manipulability. 
Pattanaik’s concept of manipulability is therefore stronger and seems less 
natural than ours. 


5. Manipulation of Social Choice Functions 

In this section we will show that, according to our definition of 
manipulability, most democratic social choice functions are manipulable. 
The following two conditions are basic for such functions. 

Definition 5. A social choice function F is anonymous iff whenever 
two situations a and b are identical except that $R_{ia} = R_{jb}$ and $R_{ja} = R_{ib}$, 
for some voters i and j, then F(a) = F(b). 

Definition 6. A social choice function F is neutral iff whenever two 
situations a and b are identical except that x and y have changed places 
everywhere, then x ∈ F(a) iff y ∈ F(b) and y ∈ F(a) iff x ∈ F(b). 

In simple words, a social choice function is anonymous if it treats every 
voter in the same way, and neutral if it treats every alternative in the same 
way. 

Theorem 2. Let F be a social choice function which is defined for three 
alternatives and three voters. Let a be the following situation: 

a: 1. zyx 

2. xyz 

3. xzy. 

If F is anonymous and neutral, and if F(a) = {x}, then F is manipulable. 

Proof. Besides a, consider the following situations: 





b: 1. zyx c: 1. zyx 

2. (xy)z 2. yxz 

3. xzy. 3. xzy. 

Situation c is a variant of the so-called “voting paradox.” By the 
assumption that F is anonymous and neutral, one can show that F(c) = 
{x, y, z} (cf. [3, pp. 11-12]). We will now show that whatever nonempty 
subset of A we choose as F(b), F will be manipulable. We consider four 
cases. 

(i) z ∈ F(b). In this case, it is easy to check that F is manipulable 
by 2 at the situation b, since, for any set Z such that z ∈ Z, we have 
F(a) $>_{2b}$ Z (“$>_{2b}$” is the ordering of the subsets of A which corresponds 
to $R_{2b}$). 

(ii) F(b) = {x, y}. In this case, F will be manipulable by voter 2 at 
the situation c, since {x, y} $>_{2c}$ {x, y, z}. 

(iii) F(b) = {x}. Consider the following situation: 

d: 1. yzx 

2. (xy)z 

3. xzy. 

If z ∈ F(d), then F is manipulable by 1 at b, as is easily verified. If z ∉ F(d), 
then since F is anonymous and neutral and x and y have symmetrical 
positions in d, we conclude that F(d) = {x, y}. But then F is manipulable 
by 1 at b since {x, y} $>_{1b}$ {x}. 

(iv) F(b) = { y}. Consider the following situation: 

e: 1. zyx 

2. (xy)z 

3. zxy. 

If F(e) = {z}, then F is manipulable by 3 at b, since {z} $>_{3b}$ {y}. If F(e) ≠ 
{z}, then x ∈ F(e) or y ∈ F(e). Now consider the following situation: 

f: 1. zyx 

2. xyz 

3. zxy. 

This situation can be obtained from a by permuting x and z and 
interchanging 1’s and 2’s preference orderings. Since F(a) = {x} and F is 
anonymous and neutral, we conclude that F(f) = {z}. But then F is 
manipulable by 2 at f since for any nonempty subset X of A which contains 
x or y, X $>_{2f}$ {z}. 

These four cases exhaust all possible ways to choose F(b), and we have 
shown that F is manipulable in all cases. This proves the theorem. 

The assumption that F is defined for three alternatives and three voters 
only is introduced in order to simplify the proof. We next show that the 
theorem can be extended to cover most social choice functions used in 
practice. 

Definition 7. A social choice function F satisfies the Condorcet 
criterion iff whenever there is an alternative x in a situation a such that, 
for every alternative y ≠ x, the number of individuals who strictly prefer 
x to y is greater than the number of individuals who strictly prefer y to x, 
then F(a) = {x}. Such an alternative is called a majority alternative in the 
situation a. 
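
The notion of a majority alternative is easy to compute. The sketch below encodes each weak ordering as a dictionary from alternatives to ranks (lower is better, ties share a rank) and applies the pairwise count of Definition 7 to situations a and c of Theorem 2; the function name is hypothetical.

```python
def majority_alternative(situation, alternatives):
    """Return the majority alternative of a situation, or None if there is none."""
    for x in alternatives:
        beats_all = True
        for y in alternatives:
            if y == x:
                continue
            for_x = sum(1 for rank in situation if rank[x] < rank[y])
            for_y = sum(1 for rank in situation if rank[y] < rank[x])
            if for_x <= for_y:
                beats_all = False
                break
        if beats_all:
            return x
    return None


# Situation a: 1. zyx  2. xyz  3. xzy
situation_a = [
    {"z": 0, "y": 1, "x": 2},
    {"x": 0, "y": 1, "z": 2},
    {"x": 0, "z": 1, "y": 2},
]

# Situation c: 1. zyx  2. yxz  3. xzy (a variant of the voting paradox)
situation_c = [
    {"z": 0, "y": 1, "x": 2},
    {"y": 0, "x": 1, "z": 2},
    {"x": 0, "z": 1, "y": 2},
]
```

In situation a the alternative x wins both pairwise contests by two votes to one, whereas situation c contains a cycle and therefore has no majority alternative.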

Theorem 3. Let F be a social choice function which is defined for at 
least three voters. If F is anonymous, neutral and satisfies the Condorcet 
criterion, then F is manipulable. 

Proof. If A and V both contain three elements, then the theorem 
follows immediately from Theorem 2, since any function which satisfies 
the Condorcet criterion selects {x} in the situation a. If A contains more 
than three alternatives, then the situations in the proof of Theorem 2 
may be augmented with dummy alternatives which are ranked after x, y, 
and z, in some fixed ordering in every preference ordering. Similarly, if V 
contains more than three voters, these situations may be augmented with 
dummy individuals who all are indifferent between x, y, and z. The 
arguments of the proof of Theorem 2 are not affected by these additions, 
as is easily checked. This completes the proof of the theorem. 

In the literature there occur several types of social choice functions where
the outcomes are determined from the sums of points assigned to the
different positions in the preference orderings. Functions of this kind have
been called summation social choice functions by Fishburn [2], representable
functions by Gärdenfors [3], point systems by Smith [9], and social
choice scoring functions by Young [10]. As soon as the first position in a
preference ordering is assigned the greatest number, and the corresponding
function is neutral, such a function will select {x} as the choice set in the
situation a in Theorem 2, which is what is needed to conclude that the
function is manipulable.





6. Manipulability When Voters' Preferences Are Linear

Theorem 2 does not leave much room for useful stable social choice 
functions. However, the proof of the theorem exploits the fact (in 
situation b) that the voters are allowed to have ties in their preference 
orderings. The following example will show that if the domain of a social 
choice function is restricted to situations containing only linear orderings, 
i.e., situations in P^n, then there are interesting stable functions.

Example 5. Let F_3 be the social choice function which is defined for
situations in P^n in the following way: F_3(a) = {x}, if there is a majority
alternative x in the situation a, and F_3(a) = A otherwise.

Theorem 4. F_3 is stable, anonymous, neutral, and satisfies the Condorcet
criterion.

Proof. We show that F_3 is stable. The remaining properties are immediate
consequences of the definition of the function. Suppose there exists
a situation (P_1, ..., P_n), a voter i, and a preference ordering P_i', such that
F_3(P_1, ..., P_i', ..., P_n) >_i F_3(P_1, ..., P_i, ..., P_n). Let
a = (P_1, ..., P_i, ..., P_n) and a' = (P_1, ..., P_i', ..., P_n). We divide
the proof into three cases.

(i) F_3(a') = {x} and F_3(a) = {y}, for some alternatives x and y.
Since F_3(a') >_i F_3(a) we conclude that xP_i y. Since F_3(a) = {y}, y is a
majority alternative in a, and since xP_i y, x can never become a majority
alternative in a', no matter how P_i' is chosen. Hence this case is impossible.

(ii) F_3(a') = {x} and F_3(a) = A, for some alternative x. Since
F_3(a') >_i F_3(a), it follows, from the definition of >_i and the assumption
that all preference orderings are linear, that xP_i y for all alternatives y ≠ x.
If x is not a majority alternative in a, it is not a majority alternative in a',
no matter how P_i' is chosen. Hence this case is impossible too.

(iii) F_3(a') = A and F_3(a) = {x}, for some alternative x. Since
F_3(a') >_i F_3(a), it follows in the same way as in case (ii) that yP_i x for all
alternatives y ≠ x. So, if x is a majority alternative in a, it will be a
majority alternative in a', no matter how P_i' is chosen. This shows that
also the third case is impossible.

These three cases are the only possible ones, if it is assumed that F_3 is
manipulable, according to the definitions of F_3 and >_i. We have thus
shown that F_3 is stable and the proof is complete.
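
For small electorates the claim of Theorem 4 can also be checked by exhaustive search. The sketch below is illustrative and rests on stated assumptions: orderings are tuples listed best first, the function names are hypothetical, and `prefers_outcome` encodes only the three set comparisons that the proof itself uses ({x} against {y}, {x} against A, and A against {x}), not the general definition of >_i given earlier in the paper.

```python
from itertools import permutations, product

ALTS = ("x", "y", "z")
ORDERINGS = list(permutations(ALTS))  # the six linear orderings, best first
FULL = frozenset(ALTS)


def f3(situation):
    """F_3: the majority alternative if one exists, otherwise all of A."""
    for x in ALTS:
        if all(
            sum(p.index(x) < p.index(y) for p in situation)
            > sum(p.index(y) < p.index(x) for p in situation)
            for y in ALTS
            if y != x
        ):
            return frozenset({x})
    return FULL


def prefers_outcome(p, new, old):
    """True if a voter with sincere ordering p strictly prefers `new` to `old`.

    Covers only the comparisons that can arise between outcomes of F_3.
    """
    if new == old:
        return False
    if len(new) == 1 and len(old) == 1:      # {x} versus {y}: needs x P_i y
        (x,), (y,) = new, old
        return p.index(x) < p.index(y)
    if len(new) == 1 and old == FULL:        # {x} versus A: x must be i's best
        (x,) = new
        return p[0] == x
    if new == FULL and len(old) == 1:        # A versus {x}: x must be i's worst
        (x,) = old
        return p[-1] == x
    raise ValueError("comparison not covered by this sketch")


def manipulations(n_voters):
    """List every (situation, voter, misreport) that would profit the voter."""
    found = []
    for situation in product(ORDERINGS, repeat=n_voters):
        sincere = f3(situation)
        for i in range(n_voters):
            for lie in ORDERINGS:
                reported = situation[:i] + (lie,) + situation[i + 1:]
                if prefers_outcome(situation[i], f3(reported), sincere):
                    found.append((situation, i, lie))
    return found


print(len(manipulations(3)))  # number of profitable misreports found
```

An empty result for a given number of voters is consistent with the theorem but is of course no substitute for the proof, which covers any number of voters and alternatives.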

As we remarked earlier, this kind of theorem is rather weak since we use
a concept of manipulability which includes as little as possible, and since
we only allow one individual to misrepresent his preferences. Theorem 4
can be strengthened, however, since it is possible to show that not even a
group of individuals can manipulate the outcome of F_3. Here, we define
a social choice function F to be manipulable by a group J of individuals,
J ⊆ V, at a situation a iff there is a situation b where the individuals in J
have misrepresented their preferences such that, for all individuals j in J,
F(b) >_j F(a). The proof of the fact that F_3 is stable under manipulation by
groups runs along the same lines as the proof of Theorem 4, changing the
statements about the individual preference ordering to statements about
the orderings of all individuals in the group.

The function F_3 is completely undecisive in situations where there are
no majority alternatives, and thus not suited for practical use. We can
construct a somewhat more decisive function in the following way.

Example 6. We say that an alternative x is Pareto dominated in the
situation (R_1, ..., R_n) iff there exists an alternative y such that yP_i x for all i.
The set of all Pareto dominated alternatives in a situation a is denoted
pd(a). We now define a social choice function F_4 by F_4(a) = F_3(a) − pd(a),
for all situations in P^n. As is easily checked, F_4 is anonymous, neutral,
weakly Pareto-optimal, and satisfies the Condorcet criterion. It can also be
shown that F_4 is stable. The proof, which follows the same lines as the proof
of Theorem 4, reduces to a number of subcases. Each of these cases is
rather simple, but the proof still becomes long-winded, so we omit it.
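
The construction of Example 6 can be sketched in a few lines. The code is an illustration under assumptions made here: linear orderings are written as strings listed best first, and the function names are hypothetical.

```python
def majority_alternative(situation, alternatives):
    """Majority alternative in the sense of Definition 7, or None."""
    for x in alternatives:
        if all(
            sum(p.index(x) < p.index(y) for p in situation)
            > sum(p.index(y) < p.index(x) for p in situation)
            for y in alternatives
            if y != x
        ):
            return x
    return None


def f3(situation, alternatives):
    """F_3 of Example 5: {x} for a majority alternative x, otherwise A."""
    winner = majority_alternative(situation, alternatives)
    return set(alternatives) if winner is None else {winner}


def pareto_dominated(situation, alternatives):
    """pd(a): alternatives x for which some y is ranked above x by every voter."""
    return {
        x
        for x in alternatives
        if any(
            all(p.index(y) < p.index(x) for p in situation)
            for y in alternatives
            if y != x
        )
    }


def f4(situation, alternatives):
    """F_4(a) = F_3(a) minus pd(a)."""
    return f3(situation, alternatives) - pareto_dominated(situation, alternatives)


# No majority alternative exists here, and every voter ranks z above w.
situation = ["xyzw", "yzwx", "zwxy"]
print(sorted(f4(situation, "wxyz")))  # w is removed; x, y and z remain
```

The example shows how limited the gain in decisiveness is: F_4 removes only the unanimously dominated alternative and still returns a three-element choice set.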

F_4 is still very undecisive, but we have not been able to find any more
decisive function which is stable and satisfies minimal requirements on
democratic decision functions. Our conjecture is that all such functions
are too undecisive to be of practical interest.


7. Conclusions 

This paper has shown that when defining and investigating manipulability
of group decision processes there are several factors which have
to be taken into account.

Firstly, the type of the outcome of the group decision function is 
important. We have here studied two types of social choice function, 
where the resolute functions form a subclass of the more general class. 
Defining manipulability for resolute functions is straightforward, while, 
for social choice functions in general, there are several possible ways to 
draw the line between what is manipulation and what is not. Here, we have 
chosen a definition of manipulability based on a sure-thing principle. 

Another type of group decision function, which is not dealt with in this
paper, is the "social welfare function" as defined by Arrow [1]. The outcome
of a social welfare function in a decision situation is an ordering of the
alternatives instead of a choice of a subset of them. Gardner [4] has
introduced a concept of manipulability for this kind of function based on
measures of the degree of similarity between the preference ordering which
is the outcome of the function in a given situation and the preference
ordering which expresses the sincere tastes of a voter in that situation.

Secondly, the kind of orderings used to represent voters' preferences
is relevant when determining which decision methods are manipulable.
We have shown that if ties are allowed in the voters’ orderings, then all 
democratic social choice functions which satisfy the Condorcet criterion 
are manipulable. However, if ties are not allowed, i.e., if all preference 
orderings are linear, then there exist democratic functions which satisfy 
the Condorcet criterion and are stable. 

Taking together the results in this and other recent papers on 
manipulation of group decision processes, one finds that it is impossible 
to find a decision method which is democratic, decisive, and stable. If a 
function is decisive in the extreme sense that it selects only one alternative 
in every situation, then the Gibbard-Satterthwaite theorem shows that
the function either is dictatorial, or excludes most alternatives from ever
being chosen (both of which are non-democratic properties), or is
manipulable. If a function is democratic, in the sense of being anonymous
and neutral, and decisive, e.g., in the sense that it satisfies the Condorcet 
criterion, then Theorem 3 shows that the function is manipulable. Further 
support for the general conclusion can be obtained from the fact that the 
examples of democratic and stable social choice functions we have been 
able to construct are all very undecisive. 


References 


1. K. J. Arrow, "Social Choice and Individual Values," 2nd ed., Wiley, New York,
1963.

2. P. C. Fishburn, "The Theory of Social Choice," Princeton Univ. Press, Princeton,
1973.

3. P. Gärdenfors, Positionalist voting functions, Theory and Decision 4 (1973), 1-24.

4. R. Gardner, Some implications of the Gibbard-Satterthwaite theorem,
mimeographed, 1974.

5. A. Gibbard, Manipulation of voting schemes: A general result, Econometrica 41
(1973), 587-601.

6. P. K. Pattanaik, On the stability of sincere voting situations, J. Econ. Theory 6
(1973), 558-574.

7. P. K. Pattanaik, Stability of sincere voting under some classes of non-binary group
decision procedures, J. Econ. Theory 8 (1974), 206-224.

8. M. Satterthwaite, Strategy-proofness and Arrow's conditions: Existence and
correspondence theorems for voting procedures and social welfare functions,
J. Econ. Theory 10 (1975), 187-217.

9. J. H. Smith, Aggregation of preferences with variable electorate, Econometrica 41
(1973), 1027-1041.

10. H. P. Young, Social choice scoring functions, mimeographed, 1973.



JOURNAL OF ECONOMIC THEORY 13, 229-244 (1976)


Indexation in a Rational Expectations Model* 

Robert J. Barro 

Department of Economics, University of Rochester, Rochester, New York 75242 
Received December 8, 1975


I. Introduction 

Indexation has been proposed as a means of insulating the real economy 
from monetary disturbances. In particular, it has been suggested that
widespread indexation of wages and prices would eliminate or moderate 
the short-run tradeoff between output and unanticipated inflation, as 
described in modern theories of the Phillips curve. For example, 
Friedman [8, p. 43] says: 

... indexation will shorten the time it takes for a reduction in the rate of 
growth of total spending to have its full effect in reducing the rate of inflation. 

As the deceleration of demand pinches at various points in the economy, any 
effects on prices will be transmitted promptly to wage contracts, contracts for 
future delivery, and interest rates on outstanding long-term loans. Accordingly, 
producers’ wage costs and other costs will go up less rapidly than they would 
without indexation. This tempering of costs, in turn, will encourage employers 
to keep more people on the payroll and to produce more goods than they would 
without indexation. The encouragement of supply, in turn, will work against 
price increases, with additional moderating feedback on wages and other 
costs. 

With widespread indexation, in sum, firm monetary restraint ... would be
reflected in a much more even reduction in the pace of inflation and a much smaller
transitory rise in unemployment. (italics added)


Fischer [5] argues along similar lines, except to stress a distinction 
between nominal and real disturbances: 

In general, we expect the economy to adjust more rapidly to monetary policies
than in the present situation as indexation of wages, prices, and rates of return
has introduced greater flexibility into the economy. As to the much harder
problem of the division of changes into price and quantity adjustments, it
seems likely that more of the variations would be taken up by the former, since
quantity adjustments occur mainly due to the types of inflexibility which are
reduced by general indexation.

The adjustment of a fully indexed economy to real changes is a very serious
and difficult question about which not much is known. What is likely to come
out of a full analysis, however, is that an indexed economy is more unstable
with respect to real changes than a non-indexed one, with the converse holding
with respect to monetary changes.

* I am grateful for some useful comments by John Taylor. This research was
supported by the U.S. Department of Labor (ASPER).

Copyright © 1976 by Academic Press, Inc.
All rights of reproduction in any form reserved.

Gray [10] has constructed a model in which indexation has effects for
the cases of real and nominal disturbances that are along the lines
conjectured by Fischer.$^{1}$ Her model also provides a framework for
addressing the question of the optimal degree of indexing (for nominal
wage rates in terms of an aggregate price index). The drawback of her
model is that the results hinge on nominal wage rigidity in combination
with a specified rule for determining employment and output in
"non-market clearing" situations. Wage rigidity has been rationalized from a
long-term contracting perspective in Baily [2], Azariadis [1], and
Gordon [9]. However, if these contracts also contain optimal contingency
plans for the determination of employment in various states of the world
(as, for example, in Azariadis' model), then it is no longer clear that wage
rigidity would be associated with a Phillips curve. In this situation, the
introduction of wage flexibility through indexing would also have no effect
on the Phillips curve. I have presented these arguments in Barro [4]. For
present purposes, I refer to them only to account for my omission of
long-term contracting and wage-price rigidities.

In general, in order to analyze the implications of indexing for the 
Phillips curve, it is first necessary to have a model that generates such a 
curve. In the present paper I deal with a model that is in the spirit of 
earlier work by Friedman [7], Phelps [13], and Lucas [11, 12]. In these 
types of models there is a short-run effect on output of unanticipated
monetary expansion (hence, unanticipated price movements) because of 
specified limitations on information that permit short-run confusions 
between aggregate and relative shocks. In particular, the information 
structure specifies that individuals perceive local prices faster than they 
perceive prices in general. In this type of model, unanticipated monetary 
disturbances can have short-run real effects, because individuals incorrectly 
attribute part of the observed movements in local prices to shifts in relative 
excess demands, rather than to general monetary shifts. On the other hand, 
prices move instantaneously to clear local markets in this setup, and the 
Phillips curve does not reflect any “non-market-clearing” phenomena. 

$^{1}$ An essentially similar analysis was subsequently carried out by Fischer [6].





Indexation would operate in this type of model by producing some ex post 
adjustment of local prices, in accord with global disturbances. A key issue 
is, then, the impact of this type of ex post price correction on the manner 
in which outputs and prices react to unanticipated monetary movements. 

The main results of the model can be summarized as follows. First, 
indexation has no effect on (the entire probability distribution of) output, 
and therefore no effect on the Phillips curve-type relation between money 
and output. This result obtains because the money/output relation in this 
model derives from a specified type of incomplete information, and 
indexation has no effect on the information structure of the model. 
Indexation would be expected to “improve” the Phillips curve tradeoff 
only if it facilitated the flow of information, and there seems to be no reason 
to expect that type of effect. Second, there are two types of effects of 
indexation on price distributions. The dispersion of relative prices 
(across markets) is reduced by indexation, but the prediction variance 
for future prices (both aggregate and individual market prices) is increased. 
These effects will be discussed and interpreted in the text of the paper. 


II. Setup of the Model 

The formal setup of the model follows my earlier setup in [3], except
for the introduction of indexation on prices, as detailed below. The
discussion of the earlier paper is brief, and some familiarity with that
paper would facilitate understanding of the present analysis. As before,
there are $n$ markets, indexed by $z = 1, \ldots, n$. It is again convenient to
deal with quantity and price variables in logarithmic terms; for example,
$y_t(z)$ denotes the logarithm of the quantity of commodities transacted
at date $t$ in market $z$. The clearing of the $z$th market is accomplished by
determining the (log of) price, $P_t(z)$, to equate commodity supply and
demand, that is, to obtain $y_t^s(z) = y_t^d(z)$. The supply and demand
functions depend on $P_t(z)$, on expectations about future prices, and on a
wealth variable that includes a real money balance term. Money balances
are the only store of value in the model, and new money enters the economy
as lump-sum transfer payments from the government. The formation of
future price expectations is based on the current local information set,
$I_t(z)$, which includes an observation of the current local price, $P_t(z)$,
but which includes global information (for example, on the "aggregate"
price level and money supply) only with a one-period lag. The fundamental
information problem for individuals amounts to getting best estimates of
the current global situation from current local information and lagged
global information. More specifically, the earlier model assumed three
types of stochastic (independent, zero mean) shocks: (1) an aggregate
money shock, $m_t$; (2) a real shock to aggregate excess demand, $v_t$; and
(3) a (real) shock to relative excess demand, $\epsilon_t(z)$. Current local
information, as embodied in $P_t(z)$, amounts to the observation of a
weighted combination of these three shocks. Individuals utilize this current
information to obtain conditional expectations on the three underlying
components, $E m_t \mid I_t(z)$, $E v_t \mid I_t(z)$, and $E \epsilon_t(z) \mid I_t(z)$,
and these conditional expectations are used to form future price expectations,
$E P_{t+1} \mid I_t(z)$, where $P_{t+1}$ refers to the geometric, unweighted
average of prices across all of the markets.$^{2}$

If the aggregate shocks, $m_t$ and $v_t$, were fully observable currently,
instead of having to be imperfectly estimated from $P_t(z)$, that information
would, of course, alter (and "improve") the manner in which $P_t(z)$ and
$y_t(z)$ were determined. Indexation would not seem to make this type of
aggregate information available more quickly. Rather, the general idea of
indexing seems to involve an ex post adjustment of terms-of-trade in
accordance with (aggregate) information that becomes available only at a
later date. Suppose, then, that $P_t(z)$ represents the price "called out"
at date $t$ in market $z$, but that the final terms-of-trade associated with the
commodity transaction $y_t(z)$ will be determined later (at date $t + 1$) when
the aggregate shocks, $m_t$ and $v_t$, are observed. For example, a simple
indexing rule$^{3}$ would prescribe the ex post price, $\hat{P}_t(z)$, to be
determined at date $t + 1$ (when $m_t$ and $v_t$ are observed) in accordance with

$$\hat{P}_t(z) = P_t(z) + a_1 m_t + a_2 v_t. \tag{1}$$

If $(a_1, a_2) > 0$, Eq. (1) indicates an ex post adjustment of price in the
direction of the (unanticipated) monetary and real aggregate shocks. The
indexing rule of Eq. (1) could be rewritten in terms of the aggregate price
level and money stock, or in terms of the aggregate price level and output.
This rewriting would not alter any substantive results, but it is convenient
for present purposes to express the indexing rule directly as a function of
the disturbance terms, $m_t$ and $v_t$.

The general idea for the structure of the market process under a uniform,
general indexing rule is then as follows. Individuals in market $z$ observe
the called-out price, $P_t(z)$. Given their knowledge of the indexing rule, as
expressed in Eq. (1), individuals form anticipations about $\hat{P}_t(z)$, which is
the corrected price for commodities that individuals in market $z$ will
actually pay or receive.$^{4}$ Current supply and demand, $y_t^s(z)$ and $y_t^d(z)$,
depend on anticipations about $\hat{P}_t(z)$ in relation to anticipations about prices
that could be obtained further in the future, in particular, the average
adjusted price for next period, $\hat{P}_{t+1}$. The called-out price, $P_t(z)$, is then
assumed to adjust (thereby implying adjustments in anticipations about
$\hat{P}_t(z)$ and $\hat{P}_{t+1}$) to equate $y_t^s(z)$ to $y_t^d(z)$. This market-clearing process
determines the quantity of commodities transacted, that is, current
"output," $y_t(z)$.

$^{2}$ Because the model is set up so that $E P_{t+1}(z') \mid I_t(z)$ is independent of the
market index $z'$, it is unnecessary to relate the mean of future price to a particular
market. The model could be extended to involve expectations of prices at dates further
in the future than $t + 1$, but that extension does not seem important for present
purposes.

$^{3}$ I do not deal here with the question of individual incentives to introduce indexing.
However, the results of the present analysis would be an input into that analysis.

At date $t + 1$, the corrected price, $\hat{P}_t(z)$, is determined from $P_t(z)$ and
the observations on $m_t$ and $v_t$, which determines the final payments
associated with the quantity transacted, $y_t(z)$. However, a crucial point is
that $y_t(z)$ has, itself, already been determined at date $t$, and is therefore
unaffected by the actual amount of index adjustment that occurs at date
$t + 1$ (as distinguished from the amount expected to occur from the
viewpoint of date $t$). Since quantity decisions (amounts of production,
employment, etc.) must be made at date $t$, it is not possible for (ex post)
indexation to have direct output effects. If there are effects of indexation
on output, they must operate indirectly, through the knowledge that
indexing will occur later, and through effects of this knowledge on current
supply and demand decisions.


III. Determination of Prices and Outputs 

The form of the supply and demand functions in the present analysis
corresponds to my earlier model except for the indexing effect. The supply
function is assumed to be the log-linear expression,$^{5}$

$$y_t^s(z) = \alpha_s[E\hat{P}_t(z) \mid I_t(z) - E\hat{P}_{t+1} \mid I_t(z)] - \beta_s[M_t - E\hat{P}_{t+1} \mid I_t(z)] + u_t^s + \epsilon_t^s(z). \tag{2}$$

The difference between Eq. (2) and the earlier treatment is the replacement
in the $\alpha_s$ term of the called-out price, $P_t(z)$, by the expected ex post indexing
price, conditioned on current information, $E\hat{P}_t(z) \mid I_t(z)$. As in the case
of anticipations about future prices, it is assumed that only the mean of
$\hat{P}_t(z) \mid I_t(z)$ influences commodity supply. Discrepancies between $E\hat{P}_t(z)$
and $E\hat{P}_{t+1}$ induce a current supply response in Eq. (2), as measured by the
elasticity $\alpha_s$. The remaining terms in Eq. (2) are not new and can be
summarized briefly: the $\beta_s$ term is a wealth variable in which it is assumed
implicitly that the money owned by participants of market $z$ is always
the same fraction of the aggregate money stock $M_t$; $u_t^s$ is an aggregate
shift to supply (equiproportional across markets); and $\epsilon_t^s(z)$ is a shift to
relative supply in market $z$.

$^{4}$ In this model it does not matter whether $P_t(z)$ is paid at date $t$ and the adjustment,
$\hat{P}_t(z) - P_t(z)$, occurs at date $t + 1$, or the full price, $\hat{P}_t(z)$, is paid at date $t + 1$. If
assets paid a nonzero real rate of return, these payment procedures would have to be
distinguished.

$^{5}$ The portion of the wealth term in the earlier analysis corresponding to expected
future transfers from the government, $E\Delta M_{t+1} \equiv E(M_{t+1} - M_t)$, has been omitted,
since this term is always zero in the present analysis. Any systematic effects on output
supply and demand have also been left out.

Similarly, the expression for commodity demand is

$$y_t^d(z) = -\alpha_d[E\hat{P}_t(z) \mid I_t(z) - E\hat{P}_{t+1} \mid I_t(z)] + \beta_d[M_t - E\hat{P}_{t+1} \mid I_t(z)] + u_t^d + \epsilon_t^d(z). \tag{3}$$

Market-clearing conditions will depend on excess demand measures, as
defined by

$$u_t \equiv u_t^d - u_t^s,$$

$$\epsilon_t(z) \equiv \epsilon_t^d(z) - \epsilon_t^s(z).$$

As before, the real aggregate excess demand shift, $u_t$, is generated by a
random walk,

$$u_t = u_{t-1} + v_t,$$

where $v_t$, generated by a white noise process with variance $\sigma_v^2$, represents
the innovation to aggregate excess demand. The relative excess demand
shift, $\epsilon_t(z)$, is generated by a white noise process with variance $\sigma_\epsilon^2$. The
money supply is generated by a random walk,

$$M_t = M_{t-1} + m_t,$$

where $m_t$ is generated by a white noise process with variance $\sigma_m^2$. Adding
systematic or feedback terms to the money supply process, as I did in my
earlier analysis, would not affect the present discussion. Finally, defining
$\alpha \equiv \alpha_s + \alpha_d$ and $\beta \equiv \beta_s + \beta_d$, the market clearing condition
corresponding to an equality between $y_t^s(z)$ and $y_t^d(z)$ can be derived as

$$\alpha E\hat{P}_t(z) \mid I_t(z) = (\alpha - \beta) E\hat{P}_{t+1} \mid I_t(z) + \beta M_t + u_t + \epsilon_t(z). \tag{4}$$
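
The step from Eqs. (2) and (3) to Eq. (4) is short, and a sketch of the algebra may help. The shorthand symbols $x$ and $w$ are introduced here only for this derivation and are not part of the paper's notation; the hat marks the ex post (indexed) price, on the reading of Eq. (1) adopted in this text.

```latex
% Sketch: equate supply (2) and demand (3); x and w are shorthand assumed here.
\begin{align*}
x &\equiv E\hat{P}_t(z) \mid I_t(z) - E\hat{P}_{t+1} \mid I_t(z), \qquad
w \equiv M_t - E\hat{P}_{t+1} \mid I_t(z), \\
\alpha_s x - \beta_s w + u_t^s + \epsilon_t^s(z)
  &= -\alpha_d x + \beta_d w + u_t^d + \epsilon_t^d(z), \\
(\alpha_s + \alpha_d)\, x
  &= (\beta_s + \beta_d)\, w + (u_t^d - u_t^s) + (\epsilon_t^d(z) - \epsilon_t^s(z)), \\
\alpha x &= \beta w + u_t + \epsilon_t(z), \\
\alpha\, E\hat{P}_t(z) \mid I_t(z)
  &= (\alpha - \beta)\, E\hat{P}_{t+1} \mid I_t(z) + \beta M_t + u_t + \epsilon_t(z).
\end{align*}
```

The last line collects the two terms in the expected future price, $\alpha - \beta$ of them in total, which gives the form of Eq. (4).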


The indexed price, $\hat{P}_t(z)$, is related to the called-out price, $P_t(z)$, in
accordance with the indexing rule of Equation (1). Substituting this rule
into Equation (4) yields the clearing condition in terms of called-out prices,

$$\alpha[P_t(z) + a_1 E m_t \mid I_t(z) + a_2 E v_t \mid I_t(z)] = (\alpha - \beta) E P_{t+1} \mid I_t(z) + \beta M_t + u_t + \epsilon_t(z), \tag{5}$$

where $E\hat{P}_{t+1} \mid I_t(z) = E P_{t+1} \mid I_t(z)$ since $E m_{t+1} \mid I_t(z) = E v_{t+1} \mid I_t(z) = 0$.
Recall that $a_1$ and $a_2$ are the two indexing parameters from Equation (1).

The model can be solved, in the sense of determining prices and outputs
as functions of exogenous variables, by the same procedure that I outlined
in my previous paper. I will present only a sketch of that procedure here.
The called-out price will end up as a log-linear function of the following
variables,

$$P_t(z) = \pi_1 M_{t-1} + \pi_2 m_t + \pi_3 v_t + \pi_4 \epsilon_t(z) + \pi_5 u_{t-1}, \tag{6}$$

where the $\pi$'s are a set of unknown coefficients to be determined by use of
the market clearing condition of Eq. (5). Using Eq. (6), updated by
one period, the expected future price is given by

$$E P_{t+1} \mid I_t(z) = \pi_1[M_{t-1} + E m_t \mid I_t(z)] + \pi_5[u_{t-1} + E v_t \mid I_t(z)],$$

where $M_{t-1}$ and $u_{t-1}$ are assumed to be elements of the information set,
$I_t(z)$.

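The remaining elements to determine are the conditioned expectations
of $m_t$ and $v_t$, given the observation on $P_t(z)$ which amounts, from Eq. (6),
to an observation of the sum, $\pi_2 m_t + \pi_3 v_t + \pi_4 \epsilon_t(z)$. These conditional
expectations are, as in the earlier analysis, assuming that $\pi_2$ and $\pi_3$ are
nonzero,

$$E m_t \mid I_t(z) = \frac{\theta_1}{\pi_2}[\pi_2 m_t + \pi_3 v_t + \pi_4 \epsilon_t(z)],$$

$$E v_t \mid I_t(z) = \frac{\theta_2}{\pi_3}[\pi_2 m_t + \pi_3 v_t + \pi_4 \epsilon_t(z)],$$

where

$$\theta_1 \equiv \frac{\pi_2^2 \sigma_m^2}{\pi_2^2 \sigma_m^2 + \pi_3^2 \sigma_v^2 + \pi_4^2 \sigma_\epsilon^2}, \qquad \theta_2 \equiv \frac{\pi_3^2 \sigma_v^2}{\pi_2^2 \sigma_m^2 + \pi_3^2 \sigma_v^2 + \pi_4^2 \sigma_\epsilon^2}.$$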
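
The signal-extraction step can be illustrated numerically. The sketch below assumes the usual linear projection weights for independent, zero-mean shocks, with $\theta_1$ and $\theta_2$ read as the shares of the money shock and the aggregate real shock in the variance of the observed signal; the function names and the parameter values are hypothetical and chosen only for the example.

```python
def projection_weights(pi2, pi3, pi4, var_m, var_v, var_e):
    """Return (theta1, theta2): the shares of the money shock and of the
    aggregate real shock in the variance of the observed signal
    pi2*m + pi3*v + pi4*e (independent, zero-mean shocks assumed)."""
    total = pi2**2 * var_m + pi3**2 * var_v + pi4**2 * var_e
    return pi2**2 * var_m / total, pi3**2 * var_v / total


def conditional_expectations(signal, pi2, pi3, theta1, theta2):
    """Return (E[m | signal], E[v | signal]) under the linear projection."""
    return theta1 / pi2 * signal, theta2 / pi3 * signal


# Illustrative numbers only (not from the paper): beta = 2, pi3 = pi4 = pi2 / beta.
beta, pi2 = 2.0, 0.8
pi3 = pi4 = pi2 / beta
theta1, theta2 = projection_weights(pi2, pi3, pi4, var_m=1.0, var_v=2.0, var_e=2.0)

m, v, e = 0.5, -0.2, 0.3                      # one hypothetical draw of the shocks
signal = pi2 * m + pi3 * v + pi4 * e          # what market z infers from P_t(z)
exp_m, exp_v = conditional_expectations(signal, pi2, pi3, theta1, theta2)
print(theta1, theta2, exp_m, exp_v)
```

With $\pi_3 = \pi_4 = \pi_2/\beta$, as in the solution of the model, the weights reduce to $\theta_1 = \beta^2\sigma_m^2/(\beta^2\sigma_m^2 + \sigma_v^2 + \sigma_\epsilon^2)$ and $\theta_2 = \sigma_v^2/(\beta^2\sigma_m^2 + \sigma_v^2 + \sigma_\epsilon^2)$, so they do not depend on the level of $\pi_2$.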



The above conditions can be used, in conjunction with the market
clearing condition of Eq. (5) (which must hold here as an identity), to
determine the $\pi$-coefficients. The results are

$$\pi_1 = 1,$$

$$\pi_2 = (\theta_1 + \theta_2) + (\beta/\alpha)(1 - \theta_1 - \theta_2) - (a_1\theta_1 + a_2\beta\theta_2),$$

$$\pi_3 = \pi_2/\beta,$$

$$\pi_4 = \pi_2/\beta,$$

$$\pi_5 = 1/\beta.$$

Except for the $\pi_2$ formula, these results correspond to those from the
earlier model. The implied solutions for $P_t(z)$, $E\hat{P}_t(z)$, and $E\hat{P}_{t+1}$ ($= EP_{t+1}$)
are, assuming that $\pi_2 \neq 0$,$^{6}$

$$P_t(z) = M_{t-1} + [\theta_1 + \theta_2 + (\beta/\alpha)(1 - \theta_1 - \theta_2) - (a_1\theta_1 + a_2\beta\theta_2)] \times [m_t + (1/\beta)(v_t + \epsilon_t(z))] + (1/\beta)u_{t-1}, \tag{7}$$

$$E\hat{P}_t(z) \mid I_t(z) = M_{t-1} + [\theta_1 + \theta_2 + (\beta/\alpha)(1 - \theta_1 - \theta_2)] \times [m_t + (1/\beta)(v_t + \epsilon_t(z))] + (1/\beta)u_{t-1}, \tag{8}$$

$$E\hat{P}_{t+1} \mid I_t(z) = M_{t-1} + (\theta_1 + \theta_2)[m_t + (1/\beta)(v_t + \epsilon_t(z))] + (1/\beta)u_{t-1}. \tag{9}$$

The solution for $E\hat{P}_t(z)$ coincides with the price solution that would
arise in the absence of indexation where $a_1 = a_2 = 0$. That is, the
called-out price is determined in accordance with Eq. (7) in such a way that the
expected amount of index adjustment,

$$a_1 E m_t \mid I_t(z) + a_2 E v_t \mid I_t(z) = (a_1\theta_1 + a_2\beta\theta_2)[m_t + (1/\beta)(v_t + \epsilon_t(z))],$$

leaves $E\hat{P}_t(z)$ unaffected by indexing.$^{7}$ It is also apparent from Eq. (9)
that $E\hat{P}_{t+1}$ is independent of the indexing parameters, $a_1$ and $a_2$. Since the
supply and demand functions of Equations (2) and (3) depend only on
$E\hat{P}_t(z)$ and $E\hat{P}_{t+1}$, it is apparent that output, $y_t(z)$, would be unaffected
by the choice of $a_1$ and $a_2$. In particular, the output solution coincides
with that of the nonindexed model and is given by

$$\begin{aligned}
y_t(z) = {} & (H/\alpha)(1 - \theta_1 - \theta_2) m_t \\
& + (1/\alpha)[\alpha_s - (H/\beta)(\theta_1 + \theta_2)][v_t^d + \epsilon_t^d(z)] \\
& + (1/\alpha)[\alpha_d + (H/\beta)(\theta_1 + \theta_2)][v_t^s + \epsilon_t^s(z)] \\
& + (\beta_s/\beta) u_{t-1}^d + (\beta_d/\beta) u_{t-1}^s,
\end{aligned} \tag{10}$$

where

$$H \equiv \alpha_s\beta_d - \alpha_d\beta_s.$$

$^{6}$ If $\pi_2 = 0$, $P_t(z)$ would be completely unresponsive to current excess demand shifts,
and would therefore convey no information about $m_t$, $v_t$, or $\epsilon_t(z)$.

$^{7}$ The parameters, $\theta_1$ and $\theta_2$, are also independent of $a_1$ and $a_2$.
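
The invariance result can be checked with a few lines of arithmetic. The sketch below evaluates the coefficients that multiply the composite shock $m_t + (1/\beta)(v_t + \epsilon_t(z))$ in Eqs. (7)-(9), together with the money coefficient of Eq. (10). The function names and the parameter values are assumptions made for this illustration; $\theta_1$ and $\theta_2$ are taken as given, since they do not depend on $a_1$ and $a_2$.

```python
def price_coefficients(alpha, beta, theta1, theta2, a1, a2):
    """Coefficients on the composite shock m + (v + e)/beta in Eqs. (7)-(9)."""
    theta = theta1 + theta2
    adjustment = a1 * theta1 + a2 * beta * theta2        # expected index adjustment
    pi2 = theta + (beta / alpha) * (1 - theta) - adjustment
    return {
        "called_out": pi2,                               # Eq. (7)
        "expected_ex_post": pi2 + adjustment,            # Eq. (8)
        "expected_future": theta,                        # Eq. (9)
    }


def money_output_response(alpha_s, alpha_d, beta_s, beta_d, theta1, theta2):
    """Coefficient on m_t in Eq. (10): (H / alpha) * (1 - theta1 - theta2)."""
    H = alpha_s * beta_d - alpha_d * beta_s
    return H / (alpha_s + alpha_d) * (1 - theta1 - theta2)


# Illustrative parameter values (assumed, not taken from the paper).
params = dict(alpha=2.0, beta=1.0, theta1=0.5, theta2=0.25)
no_index = price_coefficients(a1=0.0, a2=0.0, **params)
indexed = price_coefficients(a1=0.5, a2=0.5, **params)

print(no_index["called_out"], indexed["called_out"])              # these differ
print(no_index["expected_ex_post"], indexed["expected_ex_post"])  # these agree
print(money_output_response(1.2, 0.8, 0.3, 0.7, 0.5, 0.25))       # no a1 or a2 involved
```

Only the called-out price coefficient moves with the indexing parameters. The expected ex post price and the expected future price, which are the only price terms entering Eqs. (2) and (3), are unchanged, and the output response to money contains no indexing parameter at all.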


In the earlier paper, I discussed a number of aspects of the output
solution. For present purposes I will address only the conclusion that this
solution is unaffected by indexing. Consider the effects of money on output
in the present type of model. Because $M_{t-1}$ is assumed to be fully perceived
at date $t$, this variable has no effect on current output. Correspondingly,
$M_{t-1}$ has a proportional effect on prices in Eqs. (7)-(9). However, $m_t$ is not
perceived immediately as entirely a monetary disturbance because current
information is limited to an observation of $P_t(z)$. Since $P_t(z)$ is also
influenced by $v_t$ and $\epsilon_t(z)$, there will be some confusion in individuals'
attempts to separate observed price movements into their three underlying
sources. To the extent that the $m_t$ disturbance is incorrectly viewed as a
relative excess demand shift (which will depend on $(1 - \theta_1 - \theta_2)$, which
is the fraction of $\sigma_\epsilon^2$ in the total excess demand variance,
$\beta^2\sigma_m^2 + \sigma_v^2 + \sigma_\epsilon^2$), there will be an output response as indicated in Eq. (10).$^{8,9}$
The key to the short-run relation between money shocks and output in this
type of model is incomplete current information, in particular, confusions
about the relative sizes of aggregate versus market-specific disturbances.
Indexing would be expected to alter this money/output relation only if it
worked to change the flow of information to market participants. In the
present setup, where the form and parameters of the indexing rule are
known$^{10}$ and where the ex post price adjustment occurs only after the
aggregate disturbances become general knowledge, it is clear that indexing
does nothing to alter the underlying information structure. Accordingly,
it follows that indexing does not affect the determination of (the entire
probability distribution of) output. Put another way, the particular choice
of indexing parameters, $a_1$ and $a_2$, would have no effect on any objective
function that was based on the probability distribution of output across
markets.

$^{8}$ For present purposes, $H > 0$ can be assumed. See the discussion in the earlier
paper.

$^{9}$ In a more general framework the confusion between $m_t$ and $v_t$ would induce an
additional type of output response. That effect does not occur here because $M_t$ and $u_t$
are generated by processes of the same form (random walks), so that separating these
two components is irrelevant for the prediction of $P_{t+1}$. See the earlier paper,
Appendix I.

$^{10}$ If the indexing rule were itself stochastic, then the use of the rule would introduce
extra noise into the system. Noisy indexing would be similar to noisy monetary policy,
and would exacerbate individual information problems.

The conclusion that the probability distribution of output is invariant
to indexing can be generalized with respect to the form of the indexing
rule and the form of the distributions of the stochastic terms.$^{11}$ Indexation
amounts to replacing the called-out price, $P_t(z)$, in the supply and demand
functions (Eqs. (2) and (3)) by the expected ex post indexing price,
$E\hat{P}_t(z) \mid I_t(z)$.$^{12}$ The adjustment of the called-out price to clear the market
in the nonindexed model is replaced by the adjustment of the expected
ex post indexing price to clear the market in the indexed version. Since the
expected ex post indexing price is as flexible as the called-out price, and
since the two price concepts enter the indexed and nonindexed models,
respectively, in a parallel fashion, it follows immediately that the
equilibrium value of the expected ex post indexing price in the indexed model
coincides with the equilibrium value of the called-out price in the
nonindexed model. Given this correspondence, it also follows that the
distribution of outputs in each market is invariant to indexing. Further, it is
apparent from this argument that the output-invariance result is
independent of the specific form of indexing or the form of the distributions
of the stochastic terms. However, as discussed below, the result does
depend on the absence of higher moments of the price distributions (for
example, the variance of price across space or time) from the supply and
demand functions.


IV. Effects on Price Distributions 

The specific form of the model, as reflected in the price solution from 
Eq. (7), determines distributions of prices both over time and across 
markets. In this section I examine the implications of indexing for these 
price distributions. 

A. Relative Price Variance 

Consider, first, the distribution of prices over space. The called-out 
price, $P_t(z)$, is determined from Eq. (7). The (geometric, unweighted) 
mean of these prices can be calculated by averaging out the relative shifts, 
$\varepsilon_t(z)$, to obtain 

$$P_t = M_{t-1} + [\theta_1 + \theta_2 + (\beta/\alpha)(1 - \theta_1 - \theta_2) - (a_1\theta_1 + a_2\beta\theta_2)]\,[m_t + (1/\beta)v_t] + (1/\beta)u_{t-1}. \tag{11}$$

Since the indexing rule of Eq. (1) applies equiproportionately to all 

11 I owe this discussion to John Taylor. 

12 Note that $E\tilde{P}_{t+1} \mid I_t(z) = E[E\tilde{P}_{t+1}(z') \mid I_{t+1}(z')] \mid I_t(z)$. 





markets, the cross-sectional distribution of $\tilde{P}_t(z)$ about $\tilde{P}_t$ coincides with 
that of $P_t(z)$ about $P_t$. Hence, using Eqs. (7) and (11), one obtains 

$$\tilde{P}_t(z) - \tilde{P}_t = P_t(z) - P_t = [\theta_1 + \theta_2 + (\beta/\alpha)(1 - \theta_1 - \theta_2) - (a_1\theta_1 + a_2\beta\theta_2)]\,(1/\beta)\,\varepsilon_t(z). \tag{12}$$

Defining the above bracketed expression by 

$$\lambda \equiv [\theta_1 + \theta_2 + (\beta/\alpha)(1 - \theta_1 - \theta_2) - (a_1\theta_1 + a_2\beta\theta_2)],$$

the relative price variance, denoted by $\tau_1^2$, is given by 

$$\tau_1^2 = (\lambda^2/\beta^2)\,\sigma_\varepsilon^2. \tag{13}$$

It is apparent from the form of the $\lambda$-expression that the indexing 
parameters, $a_1$ and $a_2$, can be chosen so that $\lambda$, and hence $\tau_1^2$, approach 
arbitrarily close to zero. 13 For example, if indexation applies only to the 
monetary disturbances ($a_2 = 0$), the selection of 14 

$$a_1 \approx 1 + (1/\theta_1)[\theta_2 + (\beta/\alpha)(1 - \theta_1 - \theta_2)] = 1 + (1/\beta^2\sigma_m^2)[\sigma_v^2 + (\beta/\alpha)\sigma_\varepsilon^2]$$

would compress the distribution of relative prices to one of arbitrarily 
small variance. When the shifts to aggregate excess demand, $v_t$, are not 
indexed, the index on the monetary shift, $m_t$, must be more than proportional 
in order to eliminate significant relative price dispersion. 

Starting from zero indexing ($a_1 = a_2 = 0$), it is clear from the form 
of Eq. (12) that increases in $a_1$ or $a_2$ would reduce the variance of relative 
prices. The reasoning is as follows. The indexing rule of Eq. (1) implies 
that $\tilde{P}_t(z)$ becomes more responsive to the aggregate shifts, $m_t$ and $v_t$, 
when $a_1$ and $a_2$ increase. Since the supply and demand functions depend 
on $E\tilde{P}_t(z) \mid I_t(z)$, market clearing can be maintained in this circumstance 
only if the called-out price, $P_t(z)$, becomes less sensitive to $m_t$ and $v_t$. 
Moreover, since current participants of market $z$ observe these aggregate 
disturbances only as a blurred combination with the relative shift, 
$\varepsilon_t(z)$ (that is, as $m_t + (1/\beta)[v_t + \varepsilon_t(z)]$), it must also be the case that 

13 The solution breaks down when $\lambda = 0$, because $P_t(z)$ then conveys no current 
information. See note 6, above. If the indexing rule were not entirely systematic, this 
knife-edge property of information would seem to disappear. When $\lambda < 0$, the solution 
is unusual since called-out prices (though not expected ex post indexing prices) then 
react negatively to excess demand shifts. However, the formal solution seems to be 
satisfactory for this case. 

14 The formulas for $\theta_1$ and $\theta_2$, from above, have been used here. 





(positive) indexing makes $P_t(z)$ (and therefore $\tilde{P}_t(z)$) less sensitive to $\varepsilon_t(z)$. 
This reduced sensitivity of individual market prices to market-specific 
shocks corresponds to the indexing-induced reduction in relative price 
variance. 
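The dependence of the relative price variance on the indexing parameters can be illustrated numerically. The following is a minimal sketch, not part of the original analysis. It assumes the signal-extraction weights $\theta_1 = \beta^2\sigma_m^2/D$ and $\theta_2 = \sigma_v^2/D$ with $D = \beta^2\sigma_m^2 + \sigma_v^2 + \sigma_\varepsilon^2$, which agree with the expression for $a_1$ given above but are not restated in this section. The function names and parameter values are hypothetical.

```python
def theta_weights(beta, var_m, var_v, var_e):
    """Signal-extraction weights (assumed form, see the lead-in)."""
    d = beta**2 * var_m + var_v + var_e
    return beta**2 * var_m / d, var_v / d


def lam(a1, a2, alpha, beta, var_m, var_v, var_e):
    """Bracketed expression of Eq. (12)."""
    th1, th2 = theta_weights(beta, var_m, var_v, var_e)
    return (th1 + th2 + (beta / alpha) * (1 - th1 - th2)
            - (a1 * th1 + a2 * beta * th2))


def relative_price_variance(a1, a2, alpha, beta, var_m, var_v, var_e):
    """Eq. (13): tau_1^2 = (lambda^2 / beta^2) * sigma_eps^2."""
    l = lam(a1, a2, alpha, beta, var_m, var_v, var_e)
    return (l / beta) ** 2 * var_e


def a1_limit(alpha, beta, var_m, var_v, var_e):
    """Value of a1 (with a2 = 0) at which lambda reaches zero."""
    return 1 + (var_v + (beta / alpha) * var_e) / (beta**2 * var_m)


# Hypothetical parameter values, chosen only for illustration.
params = dict(alpha=1.0, beta=2.0, var_m=1.0, var_v=0.5, var_e=0.8)
a1_star = a1_limit(**params)
for a1 in (0.0, 0.5 * a1_star, 0.99 * a1_star):
    print(a1, relative_price_variance(a1, 0.0, **params))
```

With $a_2 = 0$ the printed variance falls as $a_1$ rises toward the limiting value, which is the compression of relative prices described in the text.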

In the present framework, which does not deal with optimal search 
for information across markets, there is no direct allocative effect of this 
reduction in relative price dispersion. The compression of relative prices 
in this model does not imply any diminution of individual information 
about current aggregate or relative disturbances. More generally, one 
would anticipate that the reduced variance of prices across markets 
would imply a diminished incentive to spatial search, which would then 
interact with the determination of the spatial price distribution. However, 
it is not possible to analyze this type of effect until the framework is 
extended to deal with optimal search behavior. 

A second type of effect from reduced relative price variance would be 
a narrowing of (one-period) income distribution. In particular, 
“appropriate” indexing would lessen the current income implications of 
being a supplier to a high-priced market, a demander in a high-priced 
market, etc. 15 In the present setup, the distribution of income (and the 
prospect of a smaller variance for individual future income) does not 
affect market supply and demand, and therefore does not affect the 
determination of output across markets. However, in an extended model 
there may be some feedback from this income distribution effect to the 
determination of output. 

B. Predictability of Future Prices 

I consider here the question of predicting the future (indexed) price in 
a randomly selected market $z'$, based on the current information set $I_t(z)$. 
That is, I examine the distribution of 

$$\tilde{P}_{t+1}(z') - E\tilde{P}_{t+1} \mid I_t(z).$$

As in the earlier paper it is convenient to separate this prediction problem 
into three (independent) components by using the identity relation, 

$$\tilde{P}_{t+1}(z') - E\tilde{P}_{t+1} \mid I_t(z) \equiv [\tilde{P}_{t+1}(z') - \tilde{P}_{t+1}] + [\tilde{P}_{t+1} - E\tilde{P}_{t+1} \mid I_t] + [E\tilde{P}_{t+1} \mid I_t - E\tilde{P}_{t+1} \mid I_t(z)], \tag{14}$$

15 In order to simplify the information structure it was assumed that individuals 
were suppliers and demanders in the same market. In this situation a change in the 
variance of relative prices need not have any impact on the variance of relative incomes. 
However, the model can probably be extended to allow individuals to visit separate 
markets for supply and demand without altering any substantive results. 





where $I_t$ denotes the full current information set that includes observations 
on $m_t$ and $v_t$. The variance of the three components will be denoted by 
$\tau_1^2$, $\sigma^2$, and $\tau_2^2$, respectively, while the total variance will be denoted by $V$. 
The first component is the relative price distribution that has already been 
discussed above. It was shown there that appropriate (positive) indexation 
would reduce the relative price variance. The second component concerns 
predictions about the future (ex post indexing) aggregate price level, $\tilde{P}_{t+1}$, 
based on full current information. Since indexing increases the sensitivity 
of prices to aggregate shifts, it turns out that $\sigma^2$, the variance of the future 
absolute price level, increases with indexation. The third component, 
which deals with the distribution of information across markets, is 
independent of indexing in this model. Hence, the net effect of indexing 
on the predictability of the future price in an individual market depends 
on a balancing between reduced relative price variance ($\tau_1^2$ down) and 
increased variance of the future absolute price level ($\sigma^2$ up). 

Formally, the second component of Eq. (14) is, using Eqs. (11) and (1) 
(and the fact that $I_t$ includes observations on $m_t$ and $v_t$), 

$$\tilde{P}_{t+1} - E\tilde{P}_{t+1} \mid I_t = \lambda[m_{t+1} + (1/\beta)v_{t+1}] + a_1 m_{t+1} + a_2 v_{t+1},$$

where, again, $\lambda \equiv [\theta_1 + \theta_2 + (\beta/\alpha)(1 - \theta_1 - \theta_2) - (a_1\theta_1 + a_2\beta\theta_2)]$. This 
component has zero mean and is independent of the first component, 
which is shown (after a substitution of $\varepsilon_{t+1}(z')$ for $\varepsilon_t(z)$) in Eq. (12). The 
variance of the second component can be derived, after considerable 
algebra, to be 

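$$\sigma^2 = (\text{terms that do not involve } (a_1, a_2)) + \frac{1}{\beta^2[\beta^2\sigma_m^2 + \sigma_v^2 + \sigma_\varepsilon^2]^2} \times \{(a_1 - a_2\beta)^2(\beta^2\sigma_m^2)\,\sigma_v^2\,(\beta^2\sigma_m^2 + \sigma_v^2 + 2\sigma_\varepsilon^2) + \sigma_\varepsilon^2[\sigma_\varepsilon^2(a_1^2\beta^2\sigma_m^2 + a_2^2\beta^2\sigma_v^2) + 2(\beta^2\sigma_m^2 + \sigma_v^2 + (\beta/\alpha)\sigma_\varepsilon^2)(a_1\beta^2\sigma_m^2 + a_2\beta\sigma_v^2)]\}. \tag{15}$$

In any case the main result from Eq. (15) is that $\sigma^2$ is higher for all positive 
values of $a_1$ and $a_2$ than it would be at $a_1 = a_2 = 0$. Indexation raises 
the prediction variance for the future absolute price level, $\tilde{P}_{t+1}$. 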
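The algebra behind the variance of the second component can be checked numerically. The sketch below is illustrative only. It assumes that $m_{t+1}$ and $v_{t+1}$ are independent with variances $\sigma_m^2$ and $\sigma_v^2$, and it uses the weights $\theta_1 = \beta^2\sigma_m^2/D$ and $\theta_2 = \sigma_v^2/D$ with $D = \beta^2\sigma_m^2 + \sigma_v^2 + \sigma_\varepsilon^2$ (an assumed form consistent with the formulas above). It compares the variance computed directly from the second component with the indexing-dependent terms written out in expanded form. All names and numbers are hypothetical.

```python
def sigma2_direct(a1, a2, alpha, beta, vm, vv, ve):
    """Variance of lam*[m + v/beta] + a1*m + a2*v for independent m and v."""
    d = beta**2 * vm + vv + ve
    k = beta**2 * vm + vv + (beta / alpha) * ve
    n = a1 * beta**2 * vm + a2 * beta * vv
    lam = (k - n) / d  # the bracketed expression of Eq. (12)
    return (lam + a1) ** 2 * vm + (lam / beta + a2) ** 2 * vv


def sigma2_indexing_terms(a1, a2, alpha, beta, vm, vv, ve):
    """Expanded terms of sigma^2 that involve (a1, a2)."""
    x = beta**2 * vm
    d = x + vv + ve
    k = x + vv + (beta / alpha) * ve
    n = a1 * x + a2 * beta * vv
    braces = ((a1 - a2 * beta) ** 2 * x * vv * (x + vv + 2 * ve)
              + ve * (ve * (a1**2 * x + a2**2 * beta**2 * vv) + 2 * k * n))
    return braces / (beta**2 * d**2)


# Hypothetical parameter values.
p = dict(alpha=1.5, beta=2.0, vm=1.0, vv=0.5, ve=0.8)
for a1, a2 in [(0.5, 0.2), (1.0, 1.0), (2.0, 0.3)]:
    gap = sigma2_direct(a1, a2, **p) - sigma2_direct(0.0, 0.0, **p)
    print(a1, a2, gap, sigma2_indexing_terms(a1, a2, **p))
```

For positive $a_1$ and $a_2$ every term in the expansion is nonnegative, so the two printed columns agree and are positive.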

The third component of Eq. (14) is, using Eqs. (11), (1), and (9), 

$$E\tilde{P}_{t+1} \mid I_t - E\tilde{P}_{t+1} \mid I_t(z) = m_t + (1/\beta)v_t - (\theta_1 + \theta_2)[m_t + (1/\beta)(v_t + \varepsilon_t(z))].$$

This component can be shown to have zero mean (conditioned on $I_t(z)$), 





and to be independent of the first two components. It is apparent that the 
third component is also independent of the indexing parameters, $a_1$ and $a_2$, 
because indexing has no effect on relative information in this model. 
Accordingly, it is unnecessary, for present purposes, to deal further with 
the third component of Eq. (14). 

The total prediction variance for $\tilde{P}_{t+1}(z')$, conditioned on $I_t(z)$, is the 
sum of the three component variances. Considering only the terms that 
depend on indexing, the result is 

$$V = (\text{terms that do not involve } (a_1, a_2)) + \frac{1}{\beta^2(\beta^2\sigma_m^2 + \sigma_v^2 + \sigma_\varepsilon^2)} \times [(a_1 - a_2\beta)^2(\beta^2\sigma_m^2)\,\sigma_v^2 + \sigma_\varepsilon^2(a_1^2\beta^2\sigma_m^2 + a_2^2\beta^2\sigma_v^2)]. \tag{16}$$

In particular, since the bracketed expression in Eq. (16) is nonnegative, 
the result implies that any amount of indexing ($a_1$ and/or $a_2 \neq 0$) unambiguously 
raises the prediction variance for the indexed future price in an 
individual market, $\tilde{P}_{t+1}(z')$. The minimum value of $V$ is attained at 

$a_1 = a_2 = 0$. 
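The cancellation that leaves only nonnegative terms in the total prediction variance can be verified numerically. This is a hedged sketch rather than the author's derivation. It adds the relative price variance of Eq. (13) to the variance of the second component (assuming independent $m$ and $v$ shocks and the weights $\theta_1 = \beta^2\sigma_m^2/D$, $\theta_2 = \sigma_v^2/D$, $D = \beta^2\sigma_m^2 + \sigma_v^2 + \sigma_\varepsilon^2$), subtracts the value at zero indexing, and compares the result with the indexing terms of Eq. (16). Names and parameter values are hypothetical.

```python
def indexing_dependent_variance(a1, a2, alpha, beta, vm, vv, ve):
    """tau_1^2 + sigma^2, the two components of V that depend on (a1, a2)."""
    d = beta**2 * vm + vv + ve
    k = beta**2 * vm + vv + (beta / alpha) * ve
    n = a1 * beta**2 * vm + a2 * beta * vv
    lam = (k - n) / d
    tau1_sq = (lam / beta) ** 2 * ve                            # Eq. (13)
    sigma_sq = (lam + a1) ** 2 * vm + (lam / beta + a2) ** 2 * vv
    return tau1_sq + sigma_sq


def eq16_terms(a1, a2, beta, vm, vv, ve):
    """Indexing terms of Eq. (16); note that alpha drops out."""
    x = beta**2 * vm
    d = x + vv + ve
    bracket = ((a1 - a2 * beta) ** 2 * x * vv
               + ve * (a1**2 * x + a2**2 * beta**2 * vv))
    return bracket / (beta**2 * d)


# Hypothetical parameter values; negative indexing is included on purpose.
alpha, beta, vm, vv, ve = 1.5, 2.0, 1.0, 0.5, 0.8
base = indexing_dependent_variance(0.0, 0.0, alpha, beta, vm, vv, ve)
for a1, a2 in [(0.5, 0.2), (-0.7, 0.4), (1.5, -0.3)]:
    gap = indexing_dependent_variance(a1, a2, alpha, beta, vm, vv, ve) - base
    print(a1, a2, gap, eq16_terms(a1, a2, beta, vm, vv, ve))
```

The two printed columns coincide and are positive for indexing of either sign, in line with the statement that $V$ is minimized at $a_1 = a_2 = 0$.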

It seems clear intuitively that (positive) indexing would raise $\sigma^2$, the 
prediction variance for $\tilde{P}_{t+1}$ conditioned on full current information, 
since indexing makes prices more responsive to aggregate shocks, and the 
distribution of the future aggregate shocks is unchanged by indexing. 
Along the same lines, there is a reduced sensitivity of prices to relative 
disturbances (under some amount of positive indexing), which implies 
that the future price in an individual market could be predicted more 
easily once the aggregate price were known. The net of these two effects 
would seem to make ambiguous the overall effect of indexing on the 
prediction variance for $\tilde{P}_{t+1}(z')$. In fact, it turns out to be unambiguous 
that indexing (of any sign) makes this prediction problem more difficult. 
At this point, I do not have an intuitive explanation for this result. 

In a larger model, the predictability of future prices might have direct 
effects on the supply and demand functions, so that a change in this 
predictability would alter the determination of output. For example, 
the costs of nominal contracting and the general use of money may depend 
on future price predictability. However, the exploration of these types of 
effects would first require an analysis of the effects of price predictability 
on underlying supply/demand behavior. 

V. Conclusions 

This paper has analyzed the effects of indexing on price and quantity 
determination in a market-clearing framework that incorporates rational 





formation of expectations. Although the model is a simple one that 
excludes long-term contracting and other potential sources of price 
rigidity, the model does contain sufficient limitations on the flow of 
aggregate information to generate a Phillips curve-type relationship. 
Indexing is introduced into the model as an ex post adjustment of local 
prices in accordance with aggregate variables that are perceived only 
with a lag. The indexing rule is taken into account by individuals who 
base their current supply and demand decisions on anticipated, ex post 
indexing prices. Because indexing does nothing to speed up the flow of 
information (the underlying element in the Phillips curve in this type of 
model), it turns out that indexing also has no effect on (the entire 
probability distribution of) output. Therefore, the model gives no support 
to the familiar hypothesis that indexing would moderate the Phillips curve. 

There are some effects of indexing on the distribution of (indexed) prices 
across space and time. First, because indexing increases the sensitivity 
of local prices to aggregate shocks, there is an increase in the variance of 
future prices about their currently predictable values. Second, indexing 
can reduce the dispersion of prices across markets. The reasoning is as 
follows. Since ex post indexing makes local prices more responsive to 
aggregate shocks, it is necessary—in order to maintain (ex ante) market 
clearing—that ex ante local prices be less sensitive to aggregate shocks. 
However, since local shocks appear ex ante to individuals as a blurred 
combination with the aggregate shocks, it also follows that indexing makes 
(ex ante and ex post) local prices less sensitive to local shocks. This 
reduced sensitivity implies the reduction in price dispersion across markets. 
In a richer model, it is possible that these effects of indexing on price 
distributions could lead also to effects on the determination of output. 


References 

1. C. Azariadis, Implicit contracts and underemployment equilibria, J. Political 
Economy 83(1975), 1183-1202. 

2. M. N. Baily, Wages and employment under uncertain demand, Rev. Econ. Studies 
41 (1974), 37-50. 

3. R. J. Barro, Rational expectations and the role of monetary policy, J. Monetary 
Economics 2 (1976), 1-32. 

4. R. J. Barro, Long-term contracting, sticky prices, and monetary policy, unpublished, 
1976. 

5. S. Fischer, On Some Theoretical Considerations, in “The Role of Indexation,” 
(A. Swoboda, Ed.), Geneva, International Center for Monetary and Banking 
Studies, 1974. 

6. S. Fischer, Long-term contracts, rational expectations and the optimal money 
supply rule, unpublished, 1975. 

7. M. Friedman, The role of monetary policy, Amer. Econ. Rev. (1968), 1-17. 





8. M. Friedman, Monetary correction, in American Enterprise Institute, “Essays on 
Inflation and Indexation,” Washington, D. C., 1974. 

9. D. Gordon, A neo-classical theory of Keynesian unemployment, Econ. Inquiry 12 
(1974), 431-59. 

10. J. A. Gray, Wage indexation: a macroeconomic approach, J. Monetary Economics 
2 (1976). 

11. R. E. Lucas, Expectations and the neutrality of money, J. Economic Theory, April 
1972, 103-24. 

12. R. E. Lucas, Some international evidence on output-inflation tradeoffs, Amer. 
Econ. Rev. (1973), 326-34. 

13. E. S. Phelps, The new microeconomics in employment and inflation theory, in 
“Microeconomic Foundations of Employment and Inflation Theory,” (Phelps, 
et al., Eds.), Norton, New York, 1970. 



JOURNAL OF ECONOMIC THEORY 13, 245-263 (1976) 


Portfolio Selection with Transactions Costs* 
Michael J. P. Magill 

Department of Economics, Indiana University, Bloomington, Indiana 47401 

AND 

George M. Constantinides 

Graduate School of Industrial Administration, Carnegie-Mellon University, 
Pittsburgh, Pennsylvania 15213 


1. Introduction 

In several recent contributions [2, 3, 4] Merton has shown that the 
continuous time formulation of portfolio theory provides a powerful 
analytical framework for extending the standard results of one-period 
mean-variance portfolio theory to the dynamical case. As is by now 
familiar the simplifications introduced by the continuous time theory have 
their origin in Samuelson’s basic Approximation Theorem [5], for the 
mean-variance solution provides the exact solution in the limit of infinitesimal 
time periods. 1 Thus when security prices are lognormally distributed 
the Tobin-Cass-Stiglitz Separation Theorem as well as the Sharpe- 
Lintner-Mossin capital market equilibrium theory can be extended in a 
natural manner to the dynamical case. 2 The simplicity with which the 
earlier mean-variance results can be extended to the dynamical case is 
certainly a strong point in favor of the continuous time analysis. 

The most important empirical justification for the use of continuous 
time analysis arises from a structural property common to most well- 

* This paper reports results of our joint work, some of which also appears in 
Constantinides’ dissertation [1]. We are grateful to Robert C. Merton and John F. 
Muth for valuable discussion. Needless to say we remain responsible for all remaining 
errors. 

1 In all the lengthy discussion of the applicability of mean-variance theory the 
Approximation Theorem provides by far the most fundamental justification for the 
use of mean-variance theory. For it leads naturally to an analysis of continuous time 
diffusion processes, processes which are completely characterized by their instantaneous 
mean and variance, see [6, Chap. 8]. 

2 See [3, 4]. 


Copyright © 1976 by Academic Press, Inc. 

All rights of reproduction in any form reserved. 





developed capital markets: trading opportunities in securities are available 
continuously in time. Rational investors will then wish to avail themselves 
of the opportunity of trading at every instant of time. But herein lies the 
principal weakness of Merton’s formulation of the continuous time 
theory. 3 For by combining the assumption that trading opportunities 
are available continuously with the assumption that the trading opportunities 
are available costlessly the investor is led to a quite unrealistic 
type of portfolio behavior. In the absence of any transactions costs the 
continuous time theory predicts that an investor faced with continually 
varying security prices will indulge in a completely unrealistic amount 
of security trading. Indeed it is for this reason that the discrete time theory 4 
is often adhered to as a more reasonable and realistic explanation for 
observed investor behavior since the investor trades only at suitably spaced 
discrete intervals of time. 

It is the object of this paper to show, however, that this weakness of 
Merton’s continuous time theory is readily overcome by explicitly introducing 
into the analysis the impact of transactions costs. For when such 
transactions costs are introduced it will be found that the investor only 
seeks to make use of the available trading opportunities at randomly 
spaced instants of time, a behavior pattern which accords much more 
readily with observed investor behavior. Indeed since the discrete theory 5 
only allows the investor the option of trading at preassigned intervals 
of time while in well-developed capital markets trading opportunities 
are available continuously, there are strong grounds for believing that 
the continuous time theory more accurately reflects both the trading 
opportunities available and the associated investor behavior that is 
observed on well-developed capital markets. Thus while there has been a 
tendency to focus more attention on the discrete time theory on the 
grounds that the continuous time theory is unnecessarily complex, we 
would argue that with the introduction of transactions costs the continuous 
time theory provides both on theoretical and empirical grounds the most 
realistic image of investor behavior that has been available so far. 

In the formal solution which emerges it is found that the investor 

3 It should be pointed out that Merton was very well aware of this weakness of the 
continuous time theory, see [4, p. 869]. 

4 For an analysis in discrete time see [7]. 

5 The use of the discrete time theory as opposed to the continuous time theory can 
really only be justified when the trading interval $h$ is not taken to be very “small” 
(Merton suggests $h = 1/270$ of a year [4, p. 869]). For if the discrete theory is defined 
for every $h$ and if the discrete theory converges to a well defined continuous time 
process as $h \to 0$ then on theoretical grounds (the Approximation Theorem) and on 
empirical grounds (continuous trading opportunities) the continuous time theory is 
likely to be preferred. 





trades in securities when the variation in the underlying security prices 
forces his portfolio proportions outside a certain region about the optimal 
proportions in the absence of transactions costs. The solution is related 
in an interesting way to the classic Arrow-Harris-Marschak [8] and 
Bellman-Glicksberg-Gross [9] analyses of the commodity inventory 
problem. 6 Indeed some of the earlier papers examining the impact of 
transactions costs have relied heavily on this analogy between the portfolio 
and the inventory problem. The classic analysis of Baumol [10] can be 
viewed as a translation of earlier results in deterministic inventory theory 
[11] into corresponding results on the demand for money. The extension 
of this analysis to an environment of uncertainty by Miller and Orr [12] 
in the case of fixed transactions costs and by Eppen and Fama [13] for the 
case of proportional costs similarly depended strongly on the earlier 
results [8, 9] in inventory theory. Zabel [14] who considered a discrete 
two-period two-asset (cash, security) model where the consumer maximizes 
the expected utility of his consumption explicitly considers the 
attitude toward risk of the investor as well as the cash-security composition 
of his portfolio, rather than just the stock of cash as in [10, 12, 13]. In 
this respect the present analysis is similar to that of Zabel. The method of 
analysis developed in this paper is, however, quite different from that of 
Zabel and enables us to obtain an exact characterization of the individual’s 
portfolio behavior for an arbitrary number of securities and for an 
arbitrary time horizon. 
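The qualitative trading rule described above (trade only when price movements push a portfolio proportion outside a region around its frictionless optimum) can be sketched as a simple band test. The sketch assumes a single risky security, a fixed target proportion, and a symmetric band of hypothetical half-width. In the paper the region is derived from the model rather than chosen by hand, so `trade_signal` and its parameters are illustrative names only.

```python
def trade_signal(holding_value, total_wealth, target, half_width):
    """Return 'buy', 'sell', or 'hold' for one security.

    target     -- assumed optimal proportion in the absence of transactions costs
    half_width -- hypothetical half-width of the no-trade region
    """
    proportion = holding_value / total_wealth
    if proportion > target + half_width:
        return "sell"   # proportion drifted above the region
    if proportion < target - half_width:
        return "buy"    # proportion drifted below the region
    return "hold"       # inside the region: trading is not worth its cost


# Illustrative values: target 25% of wealth, band of +/- 3 percentage points.
for value in (30.0, 26.0, 20.0):
    print(value, trade_signal(value, 100.0, target=0.25, half_width=0.03))
```

Because prices move randomly, the times at which the proportion leaves the band are themselves random, which matches the randomly spaced trading instants discussed in the Introduction.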

Section 2 formalizes the portfolio problem in the presence of transactions 
costs making explicit the underlying assumptions about the capital market 
and the individual investor. In Section 3 we derive the optimal portfolio 
policy which is characterized first in the case where the portfolio proportions 
in the absence of transactions costs are small, and subsequently for 
the general case. The paper concludes with some observations on the effect 
of transactions costs on the general theory of the capital market. 


2. The Portfolio Problem with Transactions Costs 

Consider an investor who faces a capital market with the following 
properties. 

6 That the portfolio theory should be related in this way to inventory theory is really 
not surprising, for we can view the investor’s portfolio as an inventory of securities 
which instead of being continually depleted by a random demand is depleted or augmented 
at random as a result of the random fluctuations in the underlying security 
prices. The problem of determining when to realign the portfolio proportions is then 
equivalent to the problem of determining when to reorder stocks for the basic inventory 



Assumption 1 (Continuous Competitive Markets). A fixed number $m$ 
of securities can be bought and sold at current prices in unlimited amounts. 
A bank (security) is available to all investors which pays a constant interest 
rate (r > 0) on deposits and charges the same rate on borrowing, which 
is available in unlimited amount. Trading takes place continuously in time. 

Assumption 2 (Securities). Each security is perfectly divisible. The 
value of the bank security is unchanged (no inflation or deflation). The 
prices of the remaining securities are lognormally distributed, all instantaneous 
variances are positive and all instantaneous correlations are less 
than one in absolute value. 

Assumption 3 (Information). All information concerning the underlying 
probability distribution of security prices as well as current quotations 
of security prices is perfect information that is available continuously and 
costlessly to all investors. 

Assumption 4 (Transactions Costs). Transactions costs are incurred 
in the purchase or sale of each security. The costs are proportional to the 
value of each transaction. 7 Thus if $v_i$ denotes the value of the $i$th security 
purchased ($v_i > 0$) or sold ($v_i < 0$) per unit of time the transactions cost 
function $T(v_1, \ldots, v_m)$ indicating the cost of buying or selling any combination 
of the $m$ securities is given by 

$$T(v_1, \ldots, v_m) = \sum_{i=1}^{m} \chi_{v_i} v_i \quad \text{where} \quad \chi_{v_i} = \begin{cases} \chi^i, & v_i \ge 0 \\ -\chi_i, & v_i < 0 \end{cases} \tag{1}$$

and where $0 \le \chi^i < 1$, $0 \le \chi_i < 1$, $i = 1, \ldots, m$. 
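As an illustration of Assumption 4, the proportional cost function can be written out directly. This is a minimal sketch assuming separate proportional rates for purchases and sales, with the sale rate entering with a negative sign so that the cost is positive in both directions. The names `chi_buy` and `chi_sell` are hypothetical labels for the two rates.

```python
def transactions_cost(v, chi_buy, chi_sell):
    """Proportional transactions cost T(v_1, ..., v_m).

    v        -- transaction rates: v_i > 0 is a purchase, v_i < 0 a sale
    chi_buy  -- proportional cost rates on purchases, each in [0, 1)
    chi_sell -- proportional cost rates on sales, each in [0, 1)
    """
    total = 0.0
    for v_i, cb, cs in zip(v, chi_buy, chi_sell):
        if not (0 <= cb < 1 and 0 <= cs < 1):
            raise ValueError("cost rates must lie in [0, 1)")
        rate = cb if v_i >= 0 else -cs   # sign switch keeps the cost positive
        total += rate * v_i
    return total


# Buy 100 of security 1 at a 1% rate, sell 50 of security 2 at a 4% rate.
print(transactions_cost([100.0, -50.0], [0.01, 0.02], [0.03, 0.04]))
```

The cost is zero only when no trading takes place, and it is linear in the size of each trade, which is the spread-cost special case discussed in footnote 7.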

The following assumption is made concerning the investor. 

Assumption 5 (Income and Lifespan). The investor has an expected 
lifespan $[0, T]$ during which he expects to earn a flow of contractual 
income $y(t)$, where $y(t)$ is a continuous function on the interval $[0, T]$. The 

7 Since the cost of buying or selling a given security is attributable to two separate 
costs, the broker’s commission and the bid-asked spread [15], a fully realistic transactions 
cost function should be the sum of a concave brokerage cost with a discontinuity 
at the origin depending on the number of securities transacted and a proportional 
spread cost as in (1). Assumption 4 considers the special case where transactions costs 
are generated solely by the spread cost. In the analysis which follows it is not necessary 
or advisable however to impute such a narrow or specific interpretation to the transactions 
costs, for they can be any costs that are associated with the purchase and sale 
of securities and can be interpreted to include a much more general class of costs such 
as information costs, taxes, and the like. 





investor acts as if both $T$ and $y(t)$ were known with certainty in advance 
at $t = 0$. 

If $p_i(t)$ denotes the price of the $i$th security and $x_i(t)$ the number of units of 
this security held by the investor at time $t$, $i = 0, 1, \ldots, m$, then $s_i(t) = x_i(t)p_i(t)$ 
is the value of his holdings of this security at time $t$. By Assumption 2 
the value of the bank security $p_0(t)$ is constant for all $t$, while at each 
instant $t$ the security prices $p_1(t), \ldots, p_m(t)$ satisfy the joint diffusion process 

$$dp_i(t) = \alpha_i p_i(t)\,dt + p_i(t)\,dz_i(t), \quad i = 1, \ldots, m, \tag{2}$$

where $dz(t) = (dz_1(t), \ldots, dz_m(t))$ is the increment of a Brownian motion 
process, so that for any partition $0 = t_0 < t_1 < \cdots < t_k = T$ of the 
interval $[0, T]$ the random variables 

$$z(t_1) - z(t_0), \ldots, z(t_k) - z(t_{k-1})$$

are independent and normally distributed with mean 

$$E[z(t_i) - z(t_{i-1})] = 0, \quad i = 1, \ldots, k$$

and covariance matrix 

$$E[(z(t_i) - z(t_{i-1}))(z(t_i) - z(t_{i-1}))'] = \Sigma\,(t_i - t_{i-1}), \quad i = 1, \ldots, k,$$

where $\Sigma$ is positive definite. 8 On all those subintervals of $[0, T]$ where 
$x_i(t)$, $\dot{x}_i(t) = dx_i(t)/dt$ are continuous, $i = 1, \ldots, m$, Ito’s Lemma 9 can be 
applied to $s_i(t) = x_i(t)p_i(t)$ so that 

$$ds_i(t) = (\alpha_i s_i(t) + v_i(t))\,dt + s_i(t)\,dz_i(t), \quad i = 1, \ldots, m, \tag{3}$$

where $v_i(t) = \dot{x}_i(t)p_i(t)$ denotes the transaction rate for the $i$th security 
at time $t$. If $c(t)$ denotes the investor’s flow of consumption expenditure 
at time $t$, then, since income is paid in cash and since both consumption expenditure 
and transactions costs must be financed from his stock of cash 
while the purchase (sale) of securities reduces (adds to) his stock of cash, 
Assumptions 1, 4, and 5 imply 

$$ds_0(t) = \Big[rs_0(t) + y(t) - c(t) - \sum_{i=1}^{m}(1 + \chi_{v_i})\,v_i(t)\Big]\,dt. \tag{4}$$
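The dynamics of security holdings and cash can be illustrated with one discrete time step. The following is a rough Euler-type sketch, not the authors' construction. It assumes that the Brownian increments are supplied by the caller, that purchases and sales carry separate proportional cost rates (hypothetical names `chi_buy`, `chi_sell`), and, in the helper `brownian_increments`, that the covariance matrix is diagonal for simplicity.

```python
import math
import random


def euler_step(s0, s, v, c, y, r, alpha, chi_buy, chi_sell, dz, dt):
    """One Euler step of Eq. (3) for securities and Eq. (4) for cash."""
    new_s = [s_i + (a_i * s_i + v_i) * dt + s_i * dz_i
             for s_i, a_i, v_i, dz_i in zip(s, alpha, v, dz)]
    # Cash outlay per unit time: value traded plus proportional costs.
    outlay = sum((1 + (cb if v_i >= 0 else -cs)) * v_i
                 for v_i, cb, cs in zip(v, chi_buy, chi_sell))
    new_s0 = s0 + (r * s0 + y - c - outlay) * dt
    return new_s0, new_s


def brownian_increments(sigma, dt, rng):
    """Independent normal increments with variance sigma_i^2 * dt
    (a diagonal covariance matrix is assumed here)."""
    return [sig * math.sqrt(dt) * rng.gauss(0.0, 1.0) for sig in sigma]


# Illustrative values only.
rng = random.Random(0)
dz = brownian_increments([0.2], dt=0.1, rng=rng)
print(euler_step(100.0, [50.0], [10.0], 5.0, 8.0, 0.05,
                 [0.1], [0.01], [0.02], dz, 0.1))
```

A purchase lowers cash by more than it raises security holdings, and a sale raises cash by less than it lowers them; the wedge is the transactions cost.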

Let $s = (s_0, \ldots, s_m)$ and let the investor choose a transaction-consumption 
policy of the form $(v, c) = (v(s, t), c(s, t))$, $t \in [0, T]$ where 

8 (2) is equivalent to the $m$ stochastic integral equations 

$$p_i(t) = p_i(0) + \alpha_i \int_0^t p_i(\theta)\,d\theta + \int_0^t p_i(\theta)\,dz_i(\theta), \quad i = 1, \ldots, m,$$

where the second integral is the Ito stochastic integral of $p_i(t)$, see [6, Chap. 8]. 

9 For a statement and proof of Ito’s Lemma see [6, pp. 386-391]. 





$v = (v_1, \ldots, v_m)$. Then (3) and (4) lead to an associated conditional probability 
density function $\psi^{(v,c)}(c, t \mid c(0))$ for the path of consumption. 10 
We make the following crucial assumption about the investor’s preferences. 

Assumption 6 (Preferences). The investor has a preference ordering 
$U(\psi^{(v,c)})$ among the probability distributions $\psi^{(v,c)}$. Furthermore there 
exists a utility function $u(c, t)$ such that the preference ordering can be 
represented 11 as follows 

$$U(\psi^{(v,c)}) = \int_0^T \int_{-\infty}^{\infty} u(c, \tau)\,\psi^{(v,c)}(c, \tau \mid c(0))\,dc\,d\tau. \tag{5}$$

Under Assumption 6 a rational investor will choose his transaction-consumption 
policy over his lifespan $[0, T]$ so as to maximize (5). This is 
equivalent to maximizing 

$$E^{(v,c)}_{s,0} \int_0^T u(c, \tau)\,d\tau \tag{6}$$

subject to (3), (4), and the initial condition $(s, 0)$ where $c(s, 0) = c(0)$ 
and where $E^{(v,c)}_{s,t}$ denotes the conditional expectation given the transaction-consumption 
policy $(v, c)$ over the time interval $[0, T]$ and that his holdings 
of securities are $s$ at time $t$. 

We will now introduce a procedure which makes it possible to solve 
the above problem using the stochastic theory of control. 12 We shall 
consider (3) as the limit of the following equations as $\varepsilon \to 0+$ (implying $\varepsilon$ 
converges to zero through positive values) 13 

$$ds_i(t) = (\alpha_i s_i(t) + v_i(t))\,dt + (s_i(t) + \varepsilon v_i(t))\,dz_i(t), \quad i = 1, \ldots, m. \tag{7}$$

Then we have the following result. 

10 Given $(v, c) = (v(s, t), c(s, t))$, (3) and (4) lead to a well-defined diffusion process 
for $ds$. Applying Ito’s Lemma we find the diffusion process $dc$; $\psi^{(v,c)}(c, t \mid c(0))$ is then 
the solution of the forward Kolmogorov (Fokker-Planck) equation associated with the 
diffusion process $dc$, see [6, Chap. 8]. 

11 Sufficient conditions for such a representation have not yet been given. For a 
discussion of the static case where the Von Neumann-Morgenstern Axioms are sufficient 
see [16, Chap. III]. Note that (5) is time-additive and implies no bequest motive. 

12 The limiting procedure which is introduced here is useful and interesting in its 
own right as a general method of solving stochastic control problems in which the controls 
enter linearly but do not directly affect the disturbance terms. The method has not 
appeared before in either the economic or the stochastic control theory literature and 
should prove useful in solving problems of this kind. 

13 It can be shown that the optimal transaction policy $v^*$ defined by (17) has the 
following properties: $\varepsilon v_i^*$ is finite and bounded as $\varepsilon \to 0+$ and $\varepsilon v_i^* \to 0$ as $s$ tends to the 
boundary of $\Omega_c$. For the limiting operations employed in the proof of Proposition 1, 
(7) is thus a valid representation of the process (3). 





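Theorem 1. 14 If $u(c, \tau) = e^{-\rho\tau}u(c)$, if the maximum in (6) exists and 
if $(v, c)$ maximizes (6) subject to (4) and (7) then the value function 

$$W(s, t) = \max_{(v,c)} E^{(v,c)}_{s,t} \int_t^T e^{-\rho(\tau - t)}u(c)\,d\tau$$

satisfies 

$$\max_{(v,c)} \Big\{ u(c) + \sum_{i=1}^{m} W_i(\alpha_i s_i + v_i) + W_0\Big(rs_0 + y - c - \sum_{i=1}^{m}(1 + \chi_{v_i})v_i\Big) + \frac{1}{2}\sum_{i,j=1}^{m} W_{ij}(s_i + \varepsilon v_i)(s_j + \varepsilon v_j)\sigma_{ij} - \rho W + W_t \Big\} = 0, \tag{8}$$

$W(s, T) = 0$, where 

$$W_i = \frac{\partial W}{\partial s_i}, \quad W_{ij} = \frac{\partial^2 W}{\partial s_i\,\partial s_j}, \quad \Sigma = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1m} \\ \vdots & & \vdots \\ \sigma_{m1} & \cdots & \sigma_{mm} \end{bmatrix}.$$

If the maximum in (8) is well defined then the maximizing $(v^*, c^*)$ must 
satisfy $u_c(c^*) - W_0 = 0$, 

$$W_j - W_0(1 + \chi_{v_j}) + \varepsilon\sum_{i=1}^{m} W_{ij}\sigma_{ij}(s_i + \varepsilon v_i^*) = 0, \quad j = 1, \ldots, m,$$

implying 

$$v^* = (1/\varepsilon)\{(1/\varepsilon)\,Q^{-1}(W_0(n + \chi_v) - W_s) - s\}, \quad c^* = u_c^{-1}(W_0), \quad u_{cc} \neq 0, \tag{9}$$

where 

$$n = (1, \ldots, 1), \quad \chi_v = (\chi_{v_1}, \ldots, \chi_{v_m}), \quad Q = [W_{ij}\sigma_{ij}], \quad W_s = (W_1, \ldots, W_m), \quad s = (s_1, \ldots, s_m).$$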


3. The Optimal Portfolio Policy 

We will use Theorem 1 to determine the nature of the investor’s portfolio 
and consumption policies under the additional assumption that the utility 

14 For a proof of Theorem 1 see [17]. A heuristic proof is easily established by applying 
Bellman’s Principle to the definition of the value function $W(s, t)$ and then using Ito’s 
Lemma. The assumption $u(c, \tau) = e^{-\rho\tau}u(c)$ is introduced here so as to reduce equation 
(8) to a form that is simpler to solve in Section 3. It may be considered as part of 
Assumption 7. 



252 


MAGILL AND CONSTANTINIDES 


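function is a member of the following family characterizing an investor 
with decreasing absolute risk aversion.¹⁵ 

Assumption 7 (Utility function). 

\[ u(c, \tau) = e^{-\rho\tau}\,\frac{1-\eta}{\eta}\Big(\frac{\beta c}{1-\eta} + \gamma(\tau)\Big)^{\eta} = e^{-\rho\tau}\,\frac{(1-\eta)^{1-\eta}}{\eta}\,\beta^{\eta}\big(c - \hat c(\tau)\big)^{\eta}, \qquad c \ge \hat c(\tau), \tag{10} \]

where 

\[ \hat c(\tau) = -\gamma(\tau)\,\frac{1-\eta}{\beta}, \qquad -\infty < \eta < 1, \quad \beta > 0, \quad -\infty < \gamma(\tau) < \infty, \quad \rho > 0. \]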


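Equations (8), (9), and (10) imply that the value function $W(s, t)$ 
satisfies the equation 

\[ \frac{(1-\eta)^2}{\eta}\Big(\frac{W_0}{\beta}\Big)^{\eta/(\eta-1)} + \sum_{i=1}^m \alpha_i s_i W_i + W_0(r s_0 + y - \hat c) - \rho W + W_t + \frac{1}{\varepsilon}\big(W_0(n + \chi_v) - W_s\big)' s - \frac{1}{2\varepsilon^2}\big(W_0(n + \chi_v) - W_s\big)' Q^{-1}\big(W_0(n + \chi_v) - W_s\big) = 0 \tag{11} \]

with boundary condition $W(s, T) = 0$. Equation (11) has a solution of the 
form 

\[ W(s, t) = \frac{a(t)}{\eta}\Big(s_0 + \sum_{i=1}^m b_i s_i + A(t)\Big)^{\eta}. \tag{12} \]

Equation (12) implies that 

\[ \frac{r s_0 + y - \hat c + \sum_{i=1}^m s_i\big[\alpha_i b_i + (1/\varepsilon)(1 + \chi_{v_i} - b_i)\big] + \dot A(t)}{s_0 + \sum_{i=1}^m b_i s_i + A(t)} \]

must be constant, so that 

\[ b_i = \frac{1 + \chi_{v_i}}{1 - \varepsilon(\alpha_i - r)}, \quad i = 1, \ldots, m, \qquad A(t) = Y(t) - C(t), \tag{13} \]

where 

\[ Y(t) = \int_t^T y(\tau)\, e^{-r(\tau - t)}\, d\tau, \qquad C(t) = \int_t^T \hat c(\tau)\, e^{-r(\tau - t)}\, d\tau. \]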

15 The Arrow-Pratt [18, 19] measure of absolute risk aversion $-u_{cc}/u_c$ is positive 
and decreasing. When $\eta > 1$ absolute risk aversion is increasing: this however seems 
to be an unlikely attitude toward risk (see [18, pp. 90-98]) and furthermore would 
give rise to perverse behavior in the analysis that follows. For a complete analysis of the 
properties of this family see [3, 20]. 





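$Y(t)$ and $C(t)$ are just the present values of his future income stream and 
his future minimum required consumption stream (when $\hat c(\tau) > 0$) 
respectively. Noting that 

\[ Q^{-1} = -\frac{\big(s_0 + \sum_{i=1}^m b_i s_i + A(t)\big)^{2-\eta}}{a(t)(1-\eta)} \begin{bmatrix} 1/b_1 & & 0 \\ & \ddots & \\ 0 & & 1/b_m \end{bmatrix} \Sigma^{-1} \begin{bmatrix} 1/b_1 & & 0 \\ & \ddots & \\ 0 & & 1/b_m \end{bmatrix}, \]

if we let $\alpha = (\alpha_1, \ldots, \alpha_m)$, $\theta = (\alpha - rn)' \Sigma^{-1} (\alpha - rn)/(2(1 - \eta))$, then (11) 
reduces to the familiar Bernoulli equation 

\[ \dot a(t) - \big(\rho - \eta(r + \theta)\big)\, a(t) + (1-\eta)^2 \Big(\frac{a(t)}{\beta}\Big)^{\eta/(\eta-1)} = 0, \qquad a(T) = 0. \tag{14} \]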

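Equations (13) and (14) imply 

\[ W(s, t) = \frac{(1-\eta)^{1-\eta}\beta^{\eta}}{\eta} \left[ \frac{(1-\eta)\big(1 - e^{-(\rho - \eta(r+\theta))(T-t)/(1-\eta)}\big)}{\rho - \eta(r + \theta)} \right]^{1-\eta} \left( s_0 + \sum_{i=1}^m \frac{(1 + \chi_{v_i})\, s_i}{1 - \varepsilon(\alpha_i - r)} + Y(t) - C(t) \right)^{\eta} \tag{15} \]

which is well defined for all $\eta < 1$ provided that 

\[ \rho > \eta(r + \theta) \tag{16} \]

and where the dependence of $\chi_{v_i}$ on $s$ remains to be determined. Note that 
(15) coincides with Merton’s solution [21] when $\chi^i = \chi_i = \varepsilon = 0$. (9) and 
(15) imply 

\[ v^* = \frac{1}{\varepsilon}\left\{ \begin{bmatrix} \dfrac{1 - \varepsilon(\alpha_1 - r)}{1 + \chi_{v_1}} & & 0 \\ & \ddots & \\ 0 & & \dfrac{1 - \varepsilon(\alpha_m - r)}{1 + \chi_{v_m}} \end{bmatrix} \frac{\Sigma^{-1}(\alpha - rn)}{1 - \eta} \left( s_0 + \sum_{i=1}^m \frac{(1 + \chi_{v_i})\, s_i}{1 - \varepsilon(\alpha_i - r)} + Y(t) - C(t) \right) - s \right\}. \tag{17} \]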





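Equation (17) is a remarkably compact set of linear equations, which 
contains the basic information used to characterize the individual's 
transaction policy. The analysis of (17) is simplified if we divide through by 

\[ w(t) = \sum_{i=0}^m s_i(t) + Y(t) - C(t), \]

which may be called the effective wealth of the individual.¹⁶ Let $\xi_i = s_i/w$, 
$i = 0, \ldots, m$, $\xi = (\xi_1, \ldots, \xi_m)$, $\xi_y = (Y - C)/w$ so that $\sum_{i=0}^m \xi_i + \xi_y = 1$. 

Provided $w \neq 0$ we can write (17) as $v^*/w = (1/\varepsilon)\, v^*(\xi, \chi_v, \varepsilon)$. If we 
define 

\[ \xi^\circ = \frac{\Sigma^{-1}(\alpha - rn)}{1 - \eta} \tag{18} \]

since $\varepsilon(\alpha_j - r) \to 0$ as $\varepsilon \to 0^+$, 

\[ \frac{1}{\varepsilon}\, v_j^*(\xi, \chi_v, \varepsilon) \to \frac{1}{\varepsilon(1 + \chi_{v_j})}\, v_j^*(\xi, \chi_v) \quad \text{as } \varepsilon \to 0^+, \]

where 

\[ v_j^*(\xi, \chi_v) = \big[\chi_{v_j}(\xi_j^\circ - 1) - 1\big]\xi_j + \xi_j^\circ\Big(1 + \sum_{i \neq j} \chi_{v_i}\xi_i\Big), \qquad j = 1, \ldots, m. \tag{19} \]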

The functions $v_j^*(\xi, \chi_v)$ are essentially signal functions which immediately 
signal when and how the securities are to be traded. To see how these 
signal functions work consider the simplest case, namely when transactions 
costs are zero. Since $v_j^*(\xi, 0) = \xi_j^\circ - \xi_j$, (17) implies that whenever 
$\xi_j < \xi_j^\circ$, $v_j^*$ should be such that $\xi_j$ approaches $\xi_j^\circ$ at a rate dependent 
on $\varepsilon$. As $\varepsilon \to 0^+$, $v_j^* \to \infty$ in such a way that $\xi_j$ is raised to $\xi_j^\circ$ 
instantaneously. Similarly whenever $\xi_j > \xi_j^\circ$, as $\varepsilon \to 0^+$, $v_j^* \to -\infty$ in such a 
way that $\xi_j$ is lowered to $\xi_j^\circ$ instantaneously. Thus by taking the limit 
as $\varepsilon \to 0^+$ in (17) we find that Theorem 1 implies that the optimal portfolio 
policy in the absence of transactions costs consists in adjusting the 
vector of portfolio proportions $\xi$ so that $\xi = \xi^\circ$ at every instant. Since 
$\xi^\circ$ coincides with the optimal portfolio proportions in [2, 3], Theorem 1, 
the signal functions $v_j^*$, and (17) lead to an alternative derivation of 
Merton’s portfolio policy. Notice that the signal functions $v_j^*$ bring out 

16 Wealth can be defined in a number of ways. Merton chooses to let $w(t) = \sum_{i=0}^m s_i(t)$. 
It can be argued that a natural definition should also include $Y(t)$. If $\hat c(\tau) > 0$ then 
it also seems natural to subtract the preplanned consumption $\hat c(\tau)$ from the future 
income $y(\tau)$ so as to obtain the capital value of net income $Y(t) - C(t)$ which is then 
added to the current value of his financial assets to obtain his wealth. Ultimately the 
definition is a matter of convenience. In this respect this definition greatly simplifies 
the subsequent analysis. 





very clearly the massive amount of trading that takes place over time in 
the absence of transactions costs. 

When transactions costs are present the terms $\xi_j^\circ \sum_i \chi_{v_i}\xi_i$ in $v_j^*(\xi, \chi_v)$ 
make the analysis considerably more complex. However when the portfolio 
proportions $\xi_j^\circ$ are sufficiently small these terms become unimportant. 
This leads to 

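Proposition 1.¹⁷ If Assumptions 1-7 are satisfied and if $|\xi_j^\circ|$ are 
sufficiently small ($|\xi_j^\circ| \ll 1$), $j = 1, \ldots, m$, then there exists a region 
$\Omega_0 \subset R^m$ such that the investor always confines his portfolio proportions 
to this region 

\[ \Omega_0 = \{\xi \in R^m \mid \xi_j \in K(\xi_j^\circ),\ j = 1, \ldots, m\}, \]

where 

\[ K(\xi_j^\circ) = \begin{cases} \left[\dfrac{\xi_j^\circ}{1 + \chi^j},\ \dfrac{\xi_j^\circ}{1 - \chi_j}\right] & \text{if } \xi_j^\circ > 0, \\[2ex] \left[\dfrac{\xi_j^\circ}{1 - \chi_j},\ \dfrac{\xi_j^\circ}{1 + \chi^j}\right] & \text{if } \xi_j^\circ < 0. \end{cases} \]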

Proof. By Theorem 1 the optimal transaction policy under Assumption 7 
is given by (17). If $\xi_j^\circ$ are sufficiently small the signal functions reduce 
to $v_j^*(\xi, \chi_v) = \xi_j^\circ - (1 + \chi_{v_j})\xi_j$. Suppose $\xi_j^\circ > 0$. Since $v_j^* > 0$ implies 
$\chi_{v_j} = \chi^j$, $v_j^* > 0$ whenever $\xi_j < \xi_j^\circ/(1 + \chi^j)$. Similarly since $v_j^* < 0$ 
implies $\chi_{v_j} = -\chi_j$ and since $\chi_j < 1$ implies $1 - \chi_j > 0$, $v_j^* < 0$ 
whenever $\xi_j > \xi_j^\circ/(1 - \chi_j)$. Suppose $\xi_j < \xi_j^\circ/(1 + \chi^j)$; then by (17) 
$v_j^*$ is such that $\xi_j$ approaches $\xi_j^\circ/(1 + \chi^j)$ at a rate dependent on $\varepsilon$. 
As $\varepsilon \to 0^+$, $v_j^* \to \infty$ in such a way that $\xi_j$ is raised instantly to $\xi_j^\circ/(1 + \chi^j)$.¹⁸ 

17 Propositions 1 and 2 determine the investor’s transaction policy under the 
assumption that the initial portfolio proportions lie in the region $\Omega_0$. It appears that 
there are conditions under which it is not optimal to transact to the boundary of $\Omega_0$ 
if $\xi$ is not initially in $\Omega_0$. For example an investor with a very short lifespan facing a 
high transaction cost rate and starting with all holdings in cash may not find it 
worthwhile to purchase the risky securities. If we let $-(c - \hat c)(u_{cc}/u_c) = 1 - \eta$ be a measure 
of relative risk aversion then the condition that the portfolio proportions $\xi^\circ$ be 
sufficiently small is equivalent to the condition that the investor be sufficiently risk averse. 
Recall that the usual measure of relative risk aversion is $-c(u_{cc}/u_c)$, see [18, 19]. 

18 The limiting process involved here is the same as that involved in the definition 
of the Dirac Delta function, the integral of which is the unit step function [22, pp. 
22-26]. If $s_j(t^-)$ and $s_j(t^+)$ denote the holdings of the jth security before and after the 
transaction at time t, then 

\[ \lim_{h \to 0} \int_{t-h}^{t+h} v_j^*(\tau)\, d\tau = s_j(t^+) - s_j(t^-) \]

so that $\chi^j\big(s_j(t^+) - s_j(t^-)\big)$ is the transaction cost incurred if $v_j^* > 0$. It should be recalled 
that since $\xi$ is the solution of a diffusion process, its velocity is infinite. It is for this 
reason that transactions must be undertaken at an infinite rate whenever $\xi$ attempts to 
penetrate the boundary of $\Omega_0$. 






For as soon as $\xi_j = \xi_j^\circ/(1 + \chi^j)$, $v_j^*(\xi, \chi_v) = \xi_j^\circ - (1 + \chi^j)\xi_j = 0$ 
implying $v_j^* = 0$. The rest is immediate. 

Suppose $\xi_j^\circ < 0$ and suppose $\xi_j < \xi_j^\circ/(1 - \chi_j)$. Since $v_j^*(\xi, \chi_v) = 
\xi_j^\circ - (1 + \chi^j)\xi_j > 0$, $v_j^* > 0$ is clearly optimal. Suppose the investor 
trades until $v_j^*(\xi, \chi_v) = \xi_j^\circ - (1 + \chi^j)\xi_j = 0$ so that $\xi_j$ is raised to 
$\xi_j = \xi_j^\circ/(1 + \chi^j)$. Since at this point $v_j^*(\xi, \chi_v) = \xi_j^\circ - (1 - \chi_j)\xi_j < 0$, 
$v_j^* < 0$ now becomes optimal and the investor trades until $\xi_j = 
\xi_j^\circ/(1 - \chi_j)$ at which point $v_j^*(\xi, \chi_v) = \xi_j^\circ - (1 - \chi_j)\xi_j = 0$. But then 
it is clearly not optimal to let $\xi_j$ exceed $\xi_j^\circ/(1 - \chi_j)$ for exceeding this 
point involves a redundant transactions charge since the investor always 
finds it optimal to return to this point. Thus when $\xi_j < \xi_j^\circ/(1 - \chi_j)$ the 
investor trades to $\xi_j = \xi_j^\circ/(1 - \chi_j)$. Similarly when $\xi_j > \xi_j^\circ/(1 + \chi^j)$ the 
investor trades to $\xi_j = \xi_j^\circ/(1 + \chi^j)$. Since it cannot be optimal to 
repeatedly trade both ways, when $\xi_j \in [\xi_j^\circ/(1 - \chi_j),\ \xi_j^\circ/(1 + \chi^j)]$ the 
investor refrains from transacting. ∎ 

Proposition 1 has a straightforward economic interpretation. Since the 
prices of the securities are continually changing according to (2) the 
portfolio proportions $\xi$ are continually changing. In Merton’s case, since 
there are no transactions costs, whenever $\xi \neq \xi^\circ$ the benefit to be gained 
from improved diversification always induces the investor to transact 
so as to return $\xi$ to $\xi^\circ$. As soon as the investor is faced with transactions 
costs, however, he must match the benefits of improved diversification against 
the associated transactions costs. Thus whenever the prices move $\xi$ around 
but $\xi$ still lies in the central region $\Omega_0$ about $\xi^\circ$, the investor does not find 
it worthwhile to alter $\xi$; in this region the transactions costs would exceed 
the benefits from improved diversification. But as soon as $\xi$ pierces the 
boundary of $\Omega_0$ the investor finds it worthwhile to transact so as to bring 
$\xi$ back to the boundary of $\Omega_0$. In this case the benefits of improved 
diversification outweigh the transactions costs. 
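
The trading rule of Proposition 1 can be sketched in a few lines of Python. This is only an illustration for a single security in the small-proportion case; the function names `no_trade_interval` and `target_proportion` and the numerical values are assumptions introduced here, not part of the paper.

```python
def no_trade_interval(xi0, chi_buy, chi_sell):
    """Interval K(xi0) of Proposition 1 for one security (small-proportion case).

    xi0      : optimal proportion without transactions costs (assumed nonzero)
    chi_buy  : proportional cost rate on purchases, 0 <= chi_buy < 1
    chi_sell : proportional cost rate on sales,     0 <= chi_sell < 1
    """
    a = xi0 / (1.0 + chi_buy)
    b = xi0 / (1.0 - chi_sell)
    # For xi0 > 0 the interval is [a, b]; for xi0 < 0 the order is reversed.
    return (min(a, b), max(a, b))


def target_proportion(xi, xi0, chi_buy, chi_sell):
    """Proportion held after applying the policy: trade only to the nearest boundary."""
    lower, upper = no_trade_interval(xi0, chi_buy, chi_sell)
    if xi < lower:
        return lower   # proportion is raised to the lower boundary
    if xi > upper:
        return upper   # proportion is lowered to the upper boundary
    return xi          # inside the interval: no transaction


# Hypothetical numbers: xi0 = 10% of wealth, 1% cost on purchases and on sales.
lower, upper = no_trade_interval(0.10, 0.01, 0.01)
print(lower, upper)
print(target_proportion(0.05, 0.10, 0.01, 0.01))
```

With zero cost rates the interval collapses to the single point `xi0`, which corresponds to Merton's case of continuous rebalancing.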

It is interesting to note that the transaction policy of Proposition 1 is 
of exactly the same form as the Bellman-Glicksberg-Gross ordering policy 
for the infinite horizon multicommodity inventory problem with proportional 
ordering costs, stated as [9, Theorem 3]. Indeed the portfolio policy 
of Proposition 1 also has an important simplifying independence property 
akin to the property that Bellman-Glicksberg-Gross refer to as suboptimality. 
This independence property only holds in the present context 
however when the portfolio proportions are small, as assumed in 
Proposition 1. 

Corollary 1. The interval to which the transaction policy confines 
the portfolio proportion $\xi_j$ of the jth security is independent of the proportions 
$\xi_i$ and the transaction cost rates $\chi^i$, $\chi_i$ for other securities $i \neq j$. 





This property is of great importance on purely empirical grounds. For 
it is the essential property that is required if the portfolio policy is to have 
a reasonable and manageable form in the presence of transactions costs. 
As is shown in the proof of Proposition 2, in the general case where the 
portfolio proportions are not necessarily small the interval to which $\xi_j$ 
is confined depends in a very complex way on $\xi_i$ and $\chi^i$, $\chi_i$, $i \neq j$. The 
complexity of the region in the general case makes it highly unlikely that 
even the most rational of investors would involve himself with such 
calculations. 

Corollary 2. The region $\Omega_0$ is independent of the investor's wealth 
and independent of the length of his remaining lifespan.¹⁹ 

Both of these properties, which hold independently of the magnitude of 
the portfolio proportions, arise from the homogeneity property characteristic 
of the HARA (hyperbolic absolute risk aversion) family of utility 
functions [3]. These properties generalize to the case of transactions costs 
two results whose importance was first stressed by Samuelson [23]. 
The first is that contrary to the advice of much investment literature the 
fact that the businessman is more wealthy than the widow does not imply 
that their portfolios should differ with respect to the risk that they carry, 
the businessman for example accepting a higher risk portfolio for the sake 
of obtaining a better yield. Secondly the fact that the businessman has a 
longer life ahead of him than the widow does not imply that the businessman 
should be prepared to invest more heavily in the risky securities. 
For the HARA family the proportion of his wealth that an investor carries 
in the risky securities is independent of his age. This time independence 
property of the portfolio policy is one important respect in which 
Proposition 1 differs from the Bellman-Glicksberg-Gross Theorem 3. For in 
[9] the critical levels at which stocks are reordered depend in the case of 
a finite planning horizon on the number of years left to the end of the plan 
and are constant only when the horizon is infinite. For a more general 
family of utility functions one would expect the same result for the portfolio 
problem. Indeed one would expect the size of the region $\Omega_0$ about 
$\xi^\circ$ to decrease as $T - t$ increases so that the longer the remaining lifespan 
of the investor the greater his propensity to transact. 

Inside the region $\Omega_0$ the portfolio proportions $\xi(t) = s(t)/w(t)$ describe 
a diffusion process the nature of which is determined by (4) and (7). Since 

19 Corollary 2 should be carefully distinguished from the Tobin-Cass-Stiglitz 
Separation Theorem [20] which in its simplest form asserts that the composition of the 
portfolio of risky assets is independent of the investor’s preferences, age, or financial 
assets. This result is the subject of a separate analysis in [24]. 





the covariance matrix $\Sigma$ is positive definite and since $\Omega_0$ is a closed, 
bounded region about $\xi^\circ$ the process $\xi(t)$ will pierce the boundaries of 
$\Omega_0$ at random instants. Since the length of the interval $K(\xi_j^\circ)$ to which the 
proportion $\xi_j$ is confined increases as $\chi^j$, $\chi_j$ increase, it is clear that the 
average frequency per unit of time with which $\xi_j(t)$ pierces the boundaries 
of $K(\xi_j^\circ)$ decreases as $\chi^j$, $\chi_j$ increase. Conversely in the limit as $\chi^j, \chi_j \to 0$ 
the investor trades continuously in the jth security. 

Corollary 3. (i) When $\chi^j, \chi_j > 0$ the investor trades in the jth 
security at randomly spaced instants of time, $j = 1, \ldots, m$. 

(ii) The average frequency of trading in any security per unit of 
time decreases as the cost of transacting the security increases.²⁰ 
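
Part (ii) of Corollary 3 can be illustrated with a crude simulation. The sketch below is not the diffusion (4), (7) of the paper: it assumes, purely for illustration, that a single proportion follows a discrete-time Gaussian random walk and is pushed back to the nearest boundary of the small-proportion interval of Proposition 1 whenever it leaves it. The function name `count_trades` and all parameter values are hypothetical.

```python
import random


def count_trades(chi, xi0=0.10, step_sd=0.01, n_steps=10_000, seed=0):
    """Count boundary transactions for one security under a toy random walk.

    chi is a common proportional cost rate for purchases and sales
    (an illustrative simplification); xi0 is the no-cost optimal proportion.
    """
    rng = random.Random(seed)                 # fixed seed for reproducibility
    lower, upper = xi0 / (1.0 + chi), xi0 / (1.0 - chi)
    xi, trades = xi0, 0
    for _ in range(n_steps):
        xi += rng.gauss(0.0, step_sd)         # price move shifts the proportion
        if xi < lower:                        # buy back up to the boundary
            xi, trades = lower, trades + 1
        elif xi > upper:                      # sell back down to the boundary
            xi, trades = upper, trades + 1
    return trades


print(count_trades(0.001), count_trades(0.2))
```

A higher cost rate widens the interval, so the simulated proportion leaves it less often and fewer transactions are recorded.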

The portfolio policy of Proposition 1 partitions the portfolio space $R^m$ 
into $\binom{m}{k} 2^k$ distinct regions 

\[ \Omega_k = \{\xi \in R^m \mid \xi_j \notin K(\xi_j^\circ) \text{ for } k \text{ indices } j\}, \qquad k = 0, \ldots, m, \]

where each $\Omega_k$ may be called a k-transaction region since whenever $\xi \in \Omega_k$, 
k securities are transacted, the remaining $(m - k)$ involving no transaction. 
$R^m$ is thus partitioned into $3^m$ distinct regions.²¹ $\Omega_0$ is the m-dimensional 
rectangular solid about $\xi^\circ$ with sides of length $2\chi_i\xi_i^\circ$, $i = 1, \ldots, m$ 
(assuming $\chi^i = \chi_i$ to be relatively small). The regions $\Omega_k$ surround 
$\Omega_0$ and $\Omega_k \cap \Omega_0$ are its $(m - k)$ dimensional hyperfaces. As soon as a 
change in security prices causes $\xi$ to pierce one of the hyperfaces $\Omega_k \cap \Omega_0$, 
k securities are transacted and $\xi$ is driven back to the hyperface. As 
$\chi^i, \chi_i \to 0$, $i = 1, \ldots, m$, the rectangular solid $\Omega_0$ shrinks to the point $\xi^\circ$ 
and we are back to Merton’s case. Figure 1 shows the regions $\Omega_k$ when 
$m = 2$, $\chi^i = \chi_i = \chi$, $i = 1, 2$.²² 
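
The count of transaction regions can be checked directly. The following Python sketch tabulates the number of k-transaction regions, $\binom{m}{k}2^k$, and confirms that they sum to $3^m$; the function name `transaction_region_counts` is an assumption introduced for this illustration.

```python
from math import comb


def transaction_region_counts(m):
    """Number of k-transaction regions for k = 0, ..., m with m risky securities.

    There are comb(m, k) ways to pick the k securities that are transacted
    and 2**k ways to decide whether each of them is bought or sold.
    """
    return [comb(m, k) * 2**k for k in range(m + 1)]


counts = transaction_region_counts(3)
print(counts)                                  # regions by number of securities traded
print(sum(transaction_region_counts(15)))      # total number of regions for m = 15
```

For m = 3 the counts for k = 1, 2, 3 are the numbers of faces, edges, and vertices of the rectangular solid on which the corresponding transactions end.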

20 Since the probability distribution for the frequency of trading depends crucially 
on the magnitude of the transaction cost rates relative to the return-covariance structure 
of the diffusion process (4), (7), it is clear that the empirical magnitudes of the transaction 
cost rates are of considerable importance if the resulting theory is to represent a 
substantial improvement over the earlier Merton theory. The empirical evidence available 
[15] suggests, as mentioned in footnote 7, that the transactions costs must be given a 
much broader interpretation than narrowly defined brokerage fees. 

21 Since there are $2^k$ ways of buying or selling k securities and since k securities can be 
chosen from m in $\binom{m}{k} = m!/(k!(m - k)!)$ ways, there are $\binom{m}{k}2^k$ regions $\Omega_k$. The Binomial 
Theorem then implies $\sum_{k=0}^m \binom{m}{k}2^k = 3^m$. Since $3^m > 10^{m(0.477)}$ this partition can involve 
an exceedingly large number of distinct regions even for a relatively small number of 
securities. For example $3^{15} = 14{,}348{,}907$. 

22 When $m = 3$, the $(m - k)$ dimensional hyperfaces $\Omega_k \cap \Omega_0$ are just the 8 vertices, 
12 edges, and 6 faces of the rectangular solid. A transaction involving 3 securities ends 
at a vertex, a transaction involving 2 securities ends at an edge and a transaction 
involving 1 security ends at a face. 






Proposition 2. If Assumptions 1-7 are satisfied then there exists a 
region $\Omega_0 \subset R^m$ such that the investor always confines his portfolio proportions 
to this region.²³ 

Proof. Consider the signal functions 

\[ v_j^*(\xi, \chi_v) = \big[\chi_{v_j}(\xi_j^\circ - 1) - 1\big]\xi_j + \xi_j^\circ\Big(1 + \sum_{i \neq j} \chi_{v_i}\xi_i\Big), \qquad j = 1, \ldots, m. \]

The idea is to use these functions to define general k-transaction regions 
$\Omega_k$ in which k securities are transacted. By calculating the regions $\Omega_m$, 
$\Omega_{m-1}$ and so on, the region $\Omega_0$ is arrived at recursively. In the proof that 
follows we assume implicitly that the regions do not overlap. When regions 
overlap however, which arises in particular when $\xi_j^\circ < 0$ or $\xi_j^\circ > 1$ 
for some indices j, we proceed as in the proof of Proposition 1 and show 
that regions in which the ith security is both bought and sold must be 
regions in which the ith security is not transacted. 

Consider the regions $\Omega_m$. Since $v_j^* \gtrless 0$ is equivalent to $v_j^*(\xi, \chi_v) \gtrless 0$, the 
inequalities $v_j^*(\xi, \chi_v) \gtrless 0$, $j = 1, \ldots, m$ define $2^m$ regions $\Omega_m$ in which all m 
securities are transacted where $\chi_{v_j} = \chi^j$ if $v_j^* > 0$, $\chi_{v_j} = -\chi_j$ if $v_j^* < 0$. 
Next we obtain the regions $\Omega_{m-1}$. Suppose the first security is not 
transacted so that $v_1^* = 0$. Let all the remaining securities have definite signs 

23 The region can become unbounded if $|\xi_j^\circ|$ is sufficiently large. In practice such 
cases are unlikely to arise. Suppose $\chi^i = \chi_i = \chi$. Then the hyperplanes $v_j^* = 0$ intersect 
the $\xi_j$ axis at the points $\xi_j = \xi_j^\circ/(1 + \chi(1 - \xi_j^\circ))$ and $\xi_j = \xi_j^\circ/(1 - \chi(1 - \xi_j^\circ))$ 
and these points become unbounded as $\xi_j^\circ \to 1 + (1/\chi)$ and $\xi_j^\circ \to 1 - (1/\chi)$ 
respectively. 







for $v_2^*, \ldots, v_m^*$, say $v_j^* > 0$, $j = 2, \ldots, m$. Consider the region defined 
by $v_j^* > 0$, $j = 2, \ldots, m$ with $\chi_{v_1} = \chi^1$; intersect this with a similar region 
obtained by setting $\chi_{v_1} = -\chi_1$. Subtract out the regions where $v_1^* > 0$, 
$v_j^* > 0$, $j = 2, \ldots, m$ and $v_1^* < 0$, $v_j^* > 0$, $j = 2, \ldots, m$ (namely the two 
$\Omega_m$ regions) and we are left with the region in which the first security is 
not transacted but the remaining securities are transacted in a definite 
way. Since there are $2^{m-1}$ ways of buying and selling the remaining $(m - 1)$ 
securities there are $2^{m-1}$ such regions in which the first security is not 
transacted and since any of the $(m - 1)$ securities can be chosen in place 
of the first security there are $m2^{m-1}$ regions involving transactions 
in $(m - 1)$ securities. The recursive procedure should now be evident. 
Proceeding in this way we obtain all the different transactions regions in $R^m$ 

\[ \Omega_m, \Omega_{m-1}, \ldots, \Omega_k, \ldots, \Omega_0. \]

By construction $\Omega_0$ is then the region to which the investor confines his 
portfolio. ∎ 



Fig. 2. Transaction regions for m = 2 (general case). 




Figure 2 shows the 9 regions $\Omega_2$, $\Omega_1$, $\Omega_0$ when $m = 2$, $\chi^i = \chi_i = \chi$, 
$\xi_i^\circ > 0$, $i = 1, 2$.²⁴ 

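Theorem 1, (9), and (15) imply that the investor’s consumption policy 
becomes, as $\varepsilon \to 0^+$, 

\[ c^*(t) = \hat c(t) + \frac{D}{1 - e^{-D(T - t)}}\Big(s_0 + \sum_{i=1}^m (1 + \chi_{v_i})\, s_i + Y(t) - C(t)\Big), \tag{20} \]

where 

\[ D = \frac{1}{1 - \eta}\big(\rho - \eta(r + \theta)\big). \]

This leads at once to 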


Proposition 3. If Assumptions 1-7 are satisfied the investor's consumption 
policy depends (i) upon his current portfolio policy, (ii) upon his 
wealth and the length of his remaining lifespan.²⁵ 

When $\chi_{v_i} = 0$, (20) coincides with Merton’s consumption policy [2, 3]; in 
this case as Samuelson observed [23], the consumption policy and the 
portfolio policy are independent financial decisions. When transactions 
costs are present, however, (20) implies that consumption varies depending 
on the region of the portfolio space in which $\xi$ lies. The factor $\sum_{i=1}^m \chi_{v_i} s_i$ 
adjusts his effective wealth in such a way that whenever the investor is 
purchasing (selling) a security, a factor is added to (subtracted from) his 
wealth, this nominal increase (reduction) in his wealth leading to an 
increase (reduction) in his consumption. The increased (decreased) 
consumption must however be drawn out of (kept in) cash which leads 
to a real reduction (increase) in his wealth thereby helping to increase 
(decrease) the proportion of the security in his portfolio. (ii) is immediate, 
though the fact that the consumption policy depends upon his remaining 
lifespan marks an important qualitative difference between the portfolio 
and the consumption policies. 
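
The dependence of consumption on the transaction region and on the remaining lifespan can be made concrete with a small Python sketch of (20). The function name `consumption_rate`, its argument names, and the numbers used are assumptions chosen for illustration; the treatment of D = 0 by its limiting value 1/(T - t) is likewise an assumption of this sketch.

```python
import math


def consumption_rate(c_min, s0, holdings, chi_v, Y, C, rho, eta, r, theta, T, t):
    """Consumption policy of equation (20), evaluated for t < T.

    holdings : values s_i of the risky securities
    chi_v    : signed cost rates, +chi^i if security i is being bought,
               -chi_i if it is being sold, 0 if it is not transacted
    Y, C     : present values of future income and of minimum consumption
    """
    D = (rho - eta * (r + theta)) / (1.0 - eta)
    adjusted_wealth = s0 + sum((1.0 + x) * s for x, s in zip(chi_v, holdings)) + Y - C
    if abs(D) < 1e-12:
        factor = 1.0 / (T - t)            # limit of D / (1 - exp(-D (T - t))) as D -> 0
    else:
        factor = D / (1.0 - math.exp(-D * (T - t)))
    return c_min + factor * adjusted_wealth


# Hypothetical investor: same holdings, different transaction regions.
args = dict(c_min=1.0, s0=10.0, holdings=[5.0], Y=2.0, C=1.0,
            rho=0.10, eta=-1.0, r=0.05, theta=0.02, T=20.0, t=0.0)
print(consumption_rate(chi_v=[0.00], **args))   # no transaction
print(consumption_rate(chi_v=[0.01], **args))   # purchasing the security
print(consumption_rate(chi_v=[-0.01], **args))  # selling the security
```

Holding everything else fixed, the purchasing region gives the highest consumption rate and the selling region the lowest, in line with the discussion of the adjustment factor above.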

24 Using the recursive procedure of the proof we obtain the following transaction 
regions. Below ABC buy 1, buy 2: to left of CDE buy 1, sell 2: to right of EFG sell 1, 
sell 2: below GHA sell 1, buy 2. These are the 4 $\Omega_2$ regions. In WCD buy 1: in DEF’ 
sell 2: in H'GH sell 1: in HAB' buy 2. These are the 4 $\Omega_1$ regions. In BB'DF’FWHB' 
do not transact. This is the region $\Omega_0$. 

25 Condition (16) ensures $c^* - \hat c > 0$ as required by (10) provided 

\[ s_0 + \sum_{i=1}^m (1 + \chi_{v_i})\, s_i + Y - C > 0. \]

The economic interpretation of (16) is familiar. For when $\eta < 0$, $u \to -\infty$ as 
$c - \hat c \to 0^+$ while when $0 < \eta < 1$, $u \to 0$ as $c - \hat c \to 0^+$. Thus when $0 < \eta < 1$ the pure 
rate of time preference $\rho > 0$ must be sufficiently large to ensure that consumption 
always exceeds the minimum level $\hat c$. We may also note that if (16) is satisfied (15) and 
(20) are well defined as $T \to \infty$, so that the portfolio problem is well defined for the 
infinite horizon case. As $T \to \infty$, $c^*$ converges to a time-independent consumption policy. 





4. Conclusion 

This paper has shown a number of fundamental qualitative changes that 
arise in the portfolio behavior of an investor when trading opportunities 
on the capital market are no longer available costlessly. The most basic 
change is that the investor substantially modifies his concept of an optimal 
portfolio which now consists of a whole region in the portfolio space. A 
direct consequence of this is that the investor only seeks to make use of 
trading opportunities at randomly spaced instants of time. Both of these 
properties are likely to hold more generally for the class of concave utility 
and transaction cost functions. The wider economic significance of trading 
costs must now be sought in their impact on the capital market as a whole. 
As one result in this direction trading costs can be shown to be an important 
factor explaining the existence of financial intermediaries such as 
mutual funds, as is shown in [24]. The methods developed in this paper 
may also prove useful in determining the impact of trading costs on 
capital market equilibrium. 


References 

1. G. M. Constantinides, “Transaction Costs in Portfolio and Cash Management,” 
D. B. A. dissertation, Indiana University, July, 1974. 

2. R. C. Merton, Lifetime portfolio selection under uncertainty: the continuous-time 
case. Rev. Econ. Statist. LI (1969), 247-257. 

3. R. C. Merton, Optimum consumption and portfolio rules in a continuous-time 
model, J. Econ. Theory 3 (1971), 373—413. 

4. R. C. Merton, An intertemporal capital asset pricing model, Econometrica 41 
(1973), 867-887. 

5. P. A. Samuelson, The fundamental approximation theorem of portfolio analysis 
in terms of means, variances, and higher moments. Rev. Econ. Studies 37 (1970). 
537-542. 

6. I. I. Gikhman and A. V. Skorokhod, “Introduction to the Theory of Random 
Processes,” Saunders, Philadelphia, 1969. 

7. N. H. Hakansson, Optimal investment and consumption strategies under risk for 
a class of utility functions, Econometrica 38 (1970), 587-607. 

8. K. J. Arrow, T. E. Harris, and J. Marschak, Optimal inventory policy, 
Econometrica 19 (1951), 250-272. 

9. R. Bellman, I. Glicksberg, and O. Gross, On the optimal inventory equation, 
Man. Science 2 (1955), 83-104. 

10. W. J. Baumol, The transactions demand for cash: an inventory theoretic approach, 
Quart. J. Econ. 66 (1952), 545-556. 

11. T. M. Whitin, “The Theory of Inventory Management,” Princeton Univ. Press, 
Princeton, N. J., 1953. 

12. M. H. Miller and D. Orr, A model of the demand for money by firms, Quart. 
J. Econ. 80 (1966), 413-435. 





13. G. D. Eppen and E. F. Fama, Cash balance and simple dynamic portfolio problems 
with proportional costs, Internat. Econ. Rev. 10 (1969), 119-133. 

14. E. Zabel, Consumer choice, portfolio decisions, and transactions costs, 
Econometrica 41 (1973), 321-335. 

15. H. Demsetz, The cost of transacting, Quart. J. Econ. 82 (1968), 33-53. 

16. K. H. Borch, “The Economics of Uncertainty,” Princeton Univ. Press, Princeton, 
N. J., 1968. 

17. H. J. Kushner, “Stochastic Stability and Control," Academic Press, New York, 
1967. 

18. K. J. Arrow, “Essays in the Theory of Risk-Bearing,” Markham, Chicago, 1971. 

19. J. W. Pratt, Risk aversion in the small and in the large, Econometrica 32 (1964), 
122-136. 

20. D. Cass and J. E. Stiglitz, The structure of investor preferences and asset returns, 
and separability in portfolio allocation: a contribution to the pure theory of mutual 
funds, J. Econ. Theory 2 (1970), 122-160. 

21. R. C. Merton, Erratum, J. Econ. Theory 6 (1973), 213-214. 

22. V. S. Pugachev, “Theory of Random Functions and Its Application to Control 
Problems,” Pergamon Press, Oxford, England, 1965. 

23. P. A. Samuelson, Lifetime portfolio selection by dynamic stochastic programming, 
Rev. Econ. Statist. LI (1969), 239-246. 

24. M. J. P. Magill, The preferability of investment through a mutual fund, J. Econ. 
Theory, 13 (1976), 264-271. 





JOURNAL OF ECONOMIC THEORY 13, 264-271 (1976) 


The Preferability of Investment Through a Mutual Fund* 

Michael J. P. Magill 

Department of Economics, Indiana University, Bloomington, Indiana 47401 
Received October 14, 1974; revised April 2, 1976 


1. Introduction 

One of the principal results of the theory of investor portfolio selection 
is the Tobin-Cass-Stiglitz Mutual Fund Theorem [1, 2]. The simplest 
version of the theorem asserts that in an economy with a riskless asset 
(money) and m risky assets, a mutual fund can be formed such that every 
individual is indifferent between investing in the mutual fund or directly 
purchasing the individual assets. 

In a recent important contribution Merton [3] has shown that the 
theorem can be extended to the continuous time framework when the m 
risky assets are joint lognormally distributed. The theorem, however, 
points to an important defect of the associated capital market theory; 
for in such a framework financial intermediaries such as mutual funds 
have no real reason to exist: every investor can achieve on his own the 
services offered by the mutual fund. 

The validity of this result depends on three basic assumptions: (i) the 
absence of transactions costs, (ii) the perfect divisibility of each security, 
so that any proportion of a security can be transacted,¹ and (iii) the 
availability of perfect, costless information. When any of these assumptions 
is dropped it would seem that an individual might prefer investment 
through a mutual fund. 

The object of this paper is to show that when the first of these basic 
assumptions is dropped, so that transactions costs are introduced explicitly, 
a mutual fund can be formed such that individual investors prefer 
investment through the mutual fund to individual investment on the 
capital market. A preliminary step is thus made towards a formal theory 

* I am grateful to Robert C. Merton for helpful discussions. Of course a standard 
caveat. An earlier version of this paper was presented at the Third World Congress 
of the Econometric Society, Toronto, August, 1975. 

1 Klein [4] has suggested that (ii) follows from (i) since in a world of zero transactions 
costs a corporation maximizes its value when it issues perfectly divisible securities. 

264 

Copyright © 1976 by Academic Press, Inc. 

All rights of reproduction in any form reserved. 



MUTUAL FUND THEOREM 


265 


of financial intermediaries within the standard theory of the capital market. 

The paper draws on the analysis in [5] in which it was shown how an 
individual investing on his own in the capital market adjusts his portfolio 
behavior in the presence of proportional transactions costs. In Section 2 
the assumptions concerning the capital market and the individual investor 
are briefly summarized, while Section 3 constructs the basic Mutual Fund 
and establishes the preferability of investment through the Mutual Fund. 


2. Assumptions Concerning the Capital Market and the Investor 


In [5] seven important assumptions were made concerning the capital 
market and the individual investor. The reader is referred to [5] for an 
exact statement of the assumptions which may be summarized as follows. 
The capital market consists of continuous competitive markets for m risky 
securities each of which is perfectly divisible. The prices of the securities 
are lognormally distributed with instantaneous mean and covariance 
matrix $(\alpha, \Sigma)$ where $\alpha = (\alpha_1, \ldots, \alpha_m)$ and 

\[ \Sigma = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1m} \\ \vdots & & \vdots \\ \sigma_{m1} & \cdots & \sigma_{mm} \end{bmatrix} \]

is positive definite and all information regarding the securities is perfect, 
continuously available and costless. Every investor can borrow or lend 
an unlimited amount at a constant interest rate $r > 0$ and expects a known 
contractual income stream $y(\tau)$ over his known lifespan $[0, T]$. If $v_i$ denotes 
the value of the ith security purchased ($v_i > 0$) or sold ($v_i < 0$) per unit 
of time then the transaction cost function $T(v_1, \ldots, v_m)$ indicating the cost 
of buying or selling any combination of the m securities is given by 

\[ T(v_1, \ldots, v_m) = \sum_{i=1}^m \chi_{v_i} v_i, \tag{1} \]

where 

\[ \chi_{v_i} = \begin{cases} \chi^i & \text{if } v_i > 0, \\ -\chi_i & \text{if } v_i < 0 \end{cases} \]

and where $0 < \chi^i < 1$, $0 < \chi_i < 1$, $i = 1, \ldots, m$ so that transactions costs 
are proportional to the value of each transaction. It was shown that if $s_i(t)$ 
denotes the value of the investor’s holdings of the ith security at time t then 

\[ ds_i(t) = [\alpha_i s_i(t) + v_i(t)]\, dt + s_i(t)\, dz_i(t), \qquad i = 1, \ldots, m, \tag{2} \]

where $dz(t) = (dz_1(t), \ldots, dz_m(t))$ is the formal increment of a Brownian 
motion process 

\[ E(dz) = 0, \qquad E(dz\, dz') = \Sigma\, dt, \]



266 


MICHAEL J. P. MAGILL 


while the investor’s stock of bank deposits (cash) $s_0(t)$ satisfies 

\[ ds_0(t) = \Big[r s_0(t) + y(t) - c(t) - \sum_{i=1}^m (1 + \chi_{v_i})\, v_i(t)\Big] dt, \tag{3} \]

$c(t)$ denoting the flow of consumption expenditure at time t. 

Assumptions concerning the investor’s preferences were made so that 
his objective was to choose a transaction-consumption policy $(v, c) = 
(v_1, \ldots, v_m, c)$ which would maximize 

\[ E^{(v,c)}_{s,0} \int_0^T u(c, \tau)\, d\tau, \tag{4} \]

where $E^{(v,c)}_{s,0}$ denotes the conditional expectation given the transaction-consumption 
policy $(v, c)$ over the time interval $[0, T]$ and given that his 
initial stock of securities is $s = (s_0, \ldots, s_m)$ at $t = 0$, subject to (2) and (3). 
The utility function was furthermore assumed to belong to the following 
family characterizing an investor with decreasing absolute risk aversion 

\[ u(c, \tau) = e^{-\rho\tau}\,\frac{1-\eta}{\eta}\Big(\frac{\beta c}{1-\eta} + \gamma(\tau)\Big)^{\eta} = e^{-\rho\tau}\,\frac{(1-\eta)^{1-\eta}}{\eta}\,\beta^{\eta}\big(c - \hat c(\tau)\big)^{\eta}, \qquad c \ge \hat c(\tau), \tag{5} \]

\[ \hat c(\tau) = -\gamma(\tau)\,\frac{1-\eta}{\beta}, \qquad -\infty < \eta < 1, \quad \beta > 0, \quad -\infty < \gamma(\tau) < \infty, \quad \rho \ge 0. \]

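Introducing the effective wealth of the investor 

\[ w(t) = \sum_{i=0}^m s_i(t) + Y(t) - C(t), \]

where 

\[ Y(t) = \int_t^T y(\tau)\, e^{-r(\tau - t)}\, d\tau, \qquad C(t) = \int_t^T \hat c(\tau)\, e^{-r(\tau - t)}\, d\tau \]

and the portfolio proportions $\xi_i = s_i/w$, $i = 0, \ldots, m$, $\xi = (\xi_1, \ldots, \xi_m)$, 
$\xi_y = (Y - C)/w$ so that $\sum_{i=0}^m \xi_i + \xi_y = 1$, it was shown that the half-spaces 
defined by 

\[ v_j^*(\xi, \chi_v) = \big[\chi_{v_j}(\xi_j^\circ - 1) - 1\big]\xi_j + \xi_j^\circ\Big(1 + \sum_{i \neq j} \chi_{v_i}\xi_i\Big) \gtrless 0, \qquad j = 1, \ldots, m, \tag{6} \]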




lead recursively to a zero-transaction region $\Omega_0$ about the optimal portfolio 
proportions in the absence of transactions costs 

\[ \xi^\circ = \Sigma^{-1}\left[\frac{\alpha - rn}{1 - \eta}\right], \qquad n = (1, \ldots, 1), \tag{7} \]

with the property that whenever $\xi \in \Omega_0$ it is optimal not to transact but as 
soon as $\xi \notin \Omega_0$ it is optimal to transact so as to return $\xi$ to the boundary 
of $\Omega_0$. An investor pursuing an optimal policy of individual investment 
on the capital market thus obtains his best results when he confines his 
portfolio to the region $\Omega_0$. When $m = 2$ the region was shown to be 
the shaded region in Fig. 1. 
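
As a numerical illustration of (7), the following Python sketch computes the no-cost optimal proportions by solving the linear system $\Sigma z = \alpha - rn$ and dividing by $1 - \eta$. The helper names `solve_linear` and `merton_proportions` and the example parameters are assumptions introduced here; a plain Gaussian elimination is used only to keep the sketch free of external libraries.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b_i] for row, b_i in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                        # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def merton_proportions(alpha, sigma, r, eta):
    """Optimal proportions xi0 = Sigma^{-1} (alpha - r n) / (1 - eta), eta < 1."""
    excess = [a - r for a in alpha]
    return [z / (1.0 - eta) for z in solve_linear(sigma, excess)]


# Hypothetical two-security market with uncorrelated returns.
xi0 = merton_proportions(alpha=[0.10, 0.08],
                         sigma=[[0.04, 0.0], [0.0, 0.09]],
                         r=0.05, eta=-1.0)
print(xi0)
```

A more risk-averse investor (a smaller value of eta) scales every component of the vector down by the same factor, leaving the relative composition of the risky holdings unchanged.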



3. A Modified Mutual Fund Theorem 

Consider the following idealized Mutual Fund. Let $F_1, \ldots, F_m$ denote the
value of its holdings of each of the $m$ risky securities and let $F = \sum_{j=1}^{m} F_j$,
$\lambda_j = F_j/F$, $j = 1, \ldots, m$, $\lambda = (\lambda_1, \ldots, \lambda_m)$. The portfolio proportions are
chosen so as to satisfy $\lambda^* = \mu\Sigma^{-1}(\alpha - rn)$, $\mu > 0$, $\lambda^{*\prime}n = 1$, so that
provided $(\alpha - rn)'\Sigma^{-1}n \ne 0$

$$\lambda^* = \Sigma^{-1}(\alpha - rn)/(\alpha - rn)'\Sigma^{-1}n. \tag{8}$$
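As a rough numerical illustration of (8), and not part of the original argument, the following Python sketch computes the Mutual Fund proportions for a hypothetical two-security market. The parameter values and the helper names `solve` and `fund_proportions` are assumptions made only for this example.

```python
# Hypothetical illustration of Eq. (8); all numbers are invented.

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                factor = m[k][col] / m[col][col]
                m[k] = [x - factor * y for x, y in zip(m[k], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fund_proportions(alpha, sigma, r):
    """Return lambda* of Eq. (8) and the scalar (alpha - r n)' S^{-1} n."""
    excess = [a_i - r for a_i in alpha]
    w = solve(sigma, excess)   # S^{-1}(alpha - r n)
    denom = sum(w)             # (alpha - r n)' S^{-1} n, since S is symmetric
    if abs(denom) < 1e-12:
        raise ValueError("(alpha - r n)' S^{-1} n is zero; (8) is undefined")
    return [x / denom for x in w], denom


alpha = [0.08, 0.12]                    # assumed mean returns
sigma = [[0.04, 0.01], [0.01, 0.09]]    # assumed covariance matrix
r = 0.03                                # assumed riskless rate

lam, denom = fund_proportions(alpha, sigma, r)
alpha_hat = sum(a_i * l_i for a_i, l_i in zip(alpha, lam))   # alpha' lambda*
var_hat = sum(lam[i] * sigma[i][j] * lam[j]
              for i in range(2) for j in range(2))           # lambda*' S lambda*
print(lam, (alpha_hat - r) / var_hat, denom)
```

The last two printed numbers agree, which is the identity between the Fund's excess return per unit of variance and $(\alpha - rn)'\Sigma^{-1}n$ that underlies the reduction of (7) in the Mutual Fund case.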





Unlike individuals this centrally administered Mutual Fund is not involved
in transactions costs when altering its portfolio. 2 This implies

$$dF(t) = \alpha'\lambda^* F(t)\,dt + F(t)\,\lambda^{*\prime}\,dz(t). \tag{9}$$

Let $N(t)$ denote the number of its shares outstanding, each share being
perfectly divisible and let $P(t)$ denote the price per share. $F(t)$ changes
continuously according to (9) except at certain instants when an investor
either deposits or withdraws funds: then $F(t)$ increases (decreases)
discontinuously but in such a way that $P(t) = F(t)/N(t)$ is unchanged.
Thus $N(t)$ is unchanged except at discrete points when it alters in such a
way that $F(t)/N(t)$ is unchanged. With this rule for issuing shares, Ito’s
Lemma [6, pp. 386-391] immediately implies

$$dP(t) = \alpha'\lambda^* P(t)\,dt + P(t)\,\lambda^{*\prime}\,dz(t). \tag{10}$$

Consider an investor faced with the opportunity of investment through
this Mutual Fund. Let $X(t)$ denote the number of shares and $S(t) =
X(t)P(t)$ the value of his Mutual Fund holdings. Then Ito’s Lemma and
(10) imply

$$dS(t) = (\hat\alpha S(t) + v(t))\,dt + S(t)\,d\hat z(t), \tag{11}$$

where $\hat\alpha = \alpha'\lambda^*$, $d\hat z(t) = \lambda^{*\prime}dz(t)$, and $v(t) = (dX(t)/dt)P(t)$ denotes the
transaction rate. If $s_0(t)$ denotes the investor’s stock of cash and if $\chi^F$
denotes the transaction cost rate for the Mutual Fund’s shares, $0 \le \chi^F < 1$,
then (3) becomes

$$ds_0(t) = [r s_0(t) + y(t) - c(t) - (1 + \chi_v^F)\,v(t)]\,dt, \tag{12}$$

where

$$\chi_v^F = \begin{cases} \chi^F & \text{if } v > 0, \\ -\chi^F & \text{if } v < 0. \end{cases}$$

Thus when investing through the Mutual Fund the individual's investment
problem reduces to choosing $(v, c)$ so as to maximize (4) with $s = (s_0, S)$
subject to (11), (12) and the initial condition $(s_0(0), S(0))$.

2 This is clearly an idealized assumption. It may be interpreted either as a purely
formal assumption which simplifies the construction of the Mutual Fund or as an 
attempt to state in extreme form the fact that transactions cost rates for a typical 
mutual fund are significantly smaller than those for a typical individual investor due 
to economies of scale in transactions. If the latter interpretation is used, note, however, 
that no attempt is made to develop a formal theory explaining the behavior of a typical 
mutual fund. 





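The method of analysis developed in [5] can now be applied to this
alternative investment problem. The investor’s effective wealth becomes
$w(t) = s_0(t) + S(t) + Y(t) - C(t)$. If we let $\theta = S/w$ then (6) leads to the
zero-transaction region for $\theta$

$$\Big\{\theta \in R \;\Big|\; \theta = \nu\,\frac{\theta^0}{1 + \chi^F(1 - \theta^0)} + (1 - \nu)\,\frac{\theta^0}{1 - \chi^F(1 - \theta^0)},\ 0 \le \nu \le 1\Big\}, \tag{13}$$

$\theta^0$ being given by (7) which in this case reduces to

$$\theta^0 = (\hat\alpha - r)/\hat\sigma^2(1 - \eta),$$

where $\hat\alpha = \alpha'\Sigma^{-1}(\alpha - rn)/(\alpha - rn)'\Sigma^{-1}n$ and $\hat\sigma^2\,dt = E(d\hat z)^2 =
((\alpha - rn)'\Sigma^{-1}(\alpha - rn)/[(\alpha - rn)'\Sigma^{-1}n]^2)\,dt$ so that

$$\theta^0 = (\alpha - rn)'\Sigma^{-1}n/(1 - \eta) = \xi^{0\prime}n.$$

Since the Mutual Fund’s portfolio always satisfies (8), when the individual
invests a proportion $\theta$ of his effective wealth in the Mutual Fund he in
effect holds a portfolio $\xi$ with two properties: $\xi$ always lies along the ray
through $\xi^0$ (by (7) and (8)) and $\xi'n = \theta$. Since the hyperplanes

$$\xi'n = \frac{\theta^0}{1 + \chi^F(1 - \theta^0)} \quad\text{and}\quad \xi'n = \frac{\theta^0}{1 - \chi^F(1 - \theta^0)}$$

cut the ray passing through $\xi^0$ at the points

$$\frac{\xi^0}{1 + \chi^F(1 - \xi^{0\prime}n)} \quad\text{and}\quad \frac{\xi^0}{1 - \chi^F(1 - \xi^{0\prime}n)},$$

the zero-transaction region (13) for $\theta$ translates into the following region in
the portfolio space:

$$\Omega_0^F = \Big\{\xi \;\Big|\; \xi = \nu\,\frac{\xi^0}{1 + \chi^F(1 - \xi^{0\prime}n)} + (1 - \nu)\,\frac{\xi^0}{1 - \chi^F(1 - \xi^{0\prime}n)},\ 0 \le \nu \le 1\Big\}. \tag{14}$$

$\Omega_0^F$ is thus the zero-transaction region for the individual when he invests
through the Mutual Fund.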


Theorem. If the investors and the capital market satisfy Assumptions
1-7, 3 if $\chi_i = \chi_j = \chi > 0$, $i, j = 1, \ldots, m$, and $\Sigma^{-1}(\alpha - rn)$ has more than one
nonzero component, then there exists a Mutual Fund such that whenever
$\chi^F \le \chi$ all investors, independent of their preferences, age, income, or
financial assets, prefer investment through the Mutual Fund to individual
investment through the capital market.*

3 The numbering refers to the statement of the Assumptions in [5]. The content of
these Assumptions is summarized in Section 2 above.

Proof. Since the investor’s preferences can be represented by (4) and (5)
(Assumptions 6 and 7) and since Assumptions 1-5 are also satisfied, the
investor’s two portfolio problems, the first involving individual investment
through the capital market and the second involving investment through
the Mutual Fund satisfying (8) and (10), are both well defined. In particular
the regions $\Omega_0$ and $\Omega_0^F$ are well defined.

Let $\mu(\xi) = (\alpha - rn)'\xi + r(1 - \xi_Y)$ and $\sigma^2(\xi) = \xi'\Sigma\xi$ denote the
instantaneous mean return and instantaneous variance of the portfolio $\xi$.
Since (5) implies that each investor is risk averse, each investor prefers a
portfolio which, for given $\mu(\xi)$, has a smaller $\sigma^2(\xi)$, and for given $\sigma^2(\xi)$
has a greater $\mu(\xi)$. A portfolio which for given $\mu(\xi)$ minimizes $\sigma^2(\xi)$ or
for given $\sigma^2(\xi)$ maximizes $\mu(\xi)$ is called efficient. It is evident that a
portfolio $\xi$ is efficient if and only if $\xi = \delta\xi^0$ for some $\delta > 0$.

Suppose $\chi^F = \chi$. Recall from [5, Eq. (17)] that when $v_j^* > 0$, $j = 1, \ldots, m$,
since $\chi_i = \chi_j = \chi$, $i, j = 1, \ldots, m$, the hyperplanes defined by (6) reduce to

$$\xi_j[1 + \chi(1 - \xi_j^0)] - \xi_j^0\chi\Big(\sum_{i \ne j}\xi_i\Big) = \xi_j^0, \qquad j = 1, \ldots, m,$$

since $v_j^* > 0$ implies $\chi_{v_j} = \chi$. It is easy to see that these hyperplanes
intersect at the point

$$\xi^0/[1 + \chi(1 - \xi^{0\prime}n)]. \tag{15}$$

Similarly when $v_j^* < 0$, $j = 1, \ldots, m$, so that $\chi_{v_j} = -\chi$, the hyperplanes
defined by (6) reduce to

$$\xi_j[1 - \chi(1 - \xi_j^0)] + \xi_j^0\chi\Big(\sum_{i \ne j}\xi_i\Big) = \xi_j^0, \qquad j = 1, \ldots, m.$$

* Although the theorem can be viewed as a preliminary step toward a simple rational 
explanation for the existence of mutual funds within the standard capital market 
theory, it should be remembered that a formal theory explaining the behavior of mutual 
funds is still lacking, so that there is no assurance as yet that there will exist rational 
mutual funds whose behavior approximates that of the idealized Mutual Fund. 

As in the proof of [5, Propositions 1 and 2] we need to make an assumption concerning
the initial portfolio proportions, namely that $\xi(0) \in \Omega_0$ and $\theta(0) \in \Omega_0^F$, since it appears
that there are conditions under which it is not optimal to transact into $\Omega_0$ or $\Omega_0^F$. See
[5, footnote 17].





It is evident that these hyperplanes intersect at the point

$$\xi^0/[1 - \chi(1 - \xi^{0\prime}n)], \tag{16}$$

so that (15) and (16) are the boundary points of $\Omega_0$ which lie on the ray
passing through $\xi^0$. But then (14) implies that $\Omega_0^F$ is exactly the set of
efficient portfolios in $\Omega_0$ (the segment BF in Fig. 1). Since $\chi_i = \chi_j = \chi > 0$
and since $\xi^0$ has at least two nonzero components, (6) implies that $\Omega_0$
contains many inefficient portfolios. A process confined to $\Omega_0^F$ is thus
clearly preferred to a process in $\Omega_0$. Suppose $\chi^F < \chi$. Since $\Omega_0^F$ reduces
to a smaller segment of the efficient portfolios about $\xi^0$ the result is
immediate.
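The algebra behind (15) and (16) can be checked numerically. The sketch below is only an illustration under invented parameter values; the function names `ray_endpoints` and `hyperplane_residual` are hypothetical and do not come from [5]. It scales $\xi^0$ by the factors in (15) and (16) and confirms that the resulting points satisfy the reduced hyperplane equations for every security.

```python
# Hypothetical illustration of (15) and (16); all numbers are invented.

def ray_endpoints(xi0, chi):
    """Boundary points of the no-transaction region on the ray through xi0."""
    total = sum(xi0)                                # xi0' n
    k_buy = 1.0 / (1.0 + chi * (1.0 - total))       # scale factor in (15)
    k_sell = 1.0 / (1.0 - chi * (1.0 - total))      # scale factor in (16)
    return [k_buy * x for x in xi0], [k_sell * x for x in xi0]


def hyperplane_residual(xi, xi0, chi_v, j):
    """Left side minus right side of the reduced form of (6) for security j.

    chi_v is +chi when every security is being bought and -chi when
    every security is being sold.
    """
    others = sum(x for i, x in enumerate(xi) if i != j)
    return xi[j] * (1.0 + chi_v * (1.0 - xi0[j])) - xi0[j] * chi_v * others - xi0[j]


xi0 = [0.3, 0.2]   # assumed frictionless proportions
chi = 0.02         # assumed common transaction cost rate

buy_point, sell_point = ray_endpoints(xi0, chi)
buy_residuals = [hyperplane_residual(buy_point, xi0, chi, j) for j in range(2)]
sell_residuals = [hyperplane_residual(sell_point, xi0, -chi, j) for j in range(2)]
print(buy_point, sell_point, buy_residuals, sell_residuals)
```

With these numbers the two endpoints bracket $\xi^0$ on the ray, and a smaller cost rate pulls both endpoints toward $\xi^0$, which is the shrinking segment invoked in the last step of the proof.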

The economic interpretation of the theorem is straightforward. Each 
investor, in determining his portfolio faces two problems: the problem of 
the composition of his portfolio of risky assets and the problem of the 
amount to be invested in the risky assets. When the individual invests on 
his own through the capital market the presence of transactions costs 
makes the control of both composition and amount a costly procedure. 
Since all investors would like the same mix of risky securities (a mixture
which depends only on the security price parameters $(\Sigma, \alpha, r)$) it is feasible
to establish a single Mutual Fund which solves the composition problem for
all investors costlessly. Provided $\chi^F \le \chi$ and $\chi > 0$ the investor prefers
investment through the Mutual Fund since he now only has to bear the
costs of adjusting the amount invested in risky assets. Indeed if $\chi^F = 0$
the investor is able to achieve through the Mutual Fund what he could 
otherwise only achieve individually on the capital market if there were no 
transactions costs. 


References 

1. J. Tobin, Liquidity preference as behaviour towards risk, Rev. Econ. Studies 26 (1958), 
65-86. 

2. D. Cass and J. E. Stiglitz, The structure of investor preferences and asset returns,
and separability in portfolio allocation: A contribution to the pure theory of mutual 
funds, J. Econ. Theory 2 (1970), 122-160. 

3. R. C. Merton, Optimum consumption and portfolio rules in a continuous time 
model, J. Econ. Theory 3 (1971), 373-413. 

4. M. A. Klein, The economics of security divisibility and financial intermediation, 
J. Finance 28 (1973), 923-931. 

5. M. J. P. Magill and G. M. Constantinides, Portfolio selection with transactions
costs, J. Econ. Theory 13 (1976), 245-263. 

6. I. I. Gikhman and A. V. Skorokhod, “Introduction to the Theory of Random 
Processes,” Saunders, Philadelphia, 1969. 



JOURNAL OF ECONOMIC THEORY 13, 272-297 (1976) 


Habit Formation and Long-Run Utility Functions* 

Robert A. Pollak 

Department of Economics, University of Pennsylvania, Philadelphia, Pennsylvania 19174 

Received March 5, 1973 


This paper extends the work on habit formation of Pollak [Habit formation
and dynamic demand functions. J. Polit. Econ. (1970)] and provides a critical
counterexample to a conjecture of von Weizsacker [Notes on endogenous
changes of tastes, J. Econ. Theory (1971)] concerning the existence of a “long-run
utility function.” A linear specification of habit formation is applied to a
general system of demand functions with linear Engel curves. It is shown that 
there exists a utility function which rationalizes the long-run demand functions 
if and only if they are the steady-state solution to a system of short-run demand 
functions generated by an additive utility function. 


1. Introduction 

This paper extends the earlier work on habit formation of Pollak [6] and
provides a critical counterexample to a conjecture of von Weizsacker [9]
concerning the existence of “long-run utility functions.”

In [6], I proposed a model of consumer behavior based on habit formation,
using a specific class of demand functions. I postulated that some of
its parameters depend linearly on past consumption and examined the
resulting system of short-run demand functions. From these I found the
implied system of long-run demand functions. Although they were
defined as the steady-state solution of the system of short-run demand
functions, I showed that, in each of the cases I considered, the long-run
demand functions could be rationalized by a utility function. This “long-run
utility function” is of the same general form as the short-run utility
function, although its parameters depend on both the parameters of the
short-run utility function and those of the habit formation specification.

All of the short-run utility functions used in Pollak [6] are additive and
generate systems of demand functions with linear Engel curves. In [7],

* This research was supported in part by grants from the National Science Foundation.
My discussion of the evaluation of welfare when tastes are subject to habit formation
owes much to Edwin Burmeister and Stephen A. Ross, to whom I am grateful.


Copyright © 1976 by Academic Press, Inc. 

All rights of reproduction in any form reserved.



HABIT AND UTILITY FUNCTIONS 




I show that the forms used in Pollak [6] are the only ones exhibiting both
of these properties. 

The class of utility functions which generates demand functions with 
linear Engel curves is completely characterized by two functions homo¬ 
geneous of degree one. When the utility function is also additive, these 
two functions each assume highly specific n-parameter forms, where n is 
the number of goods. The restriction to a class characterized by two n- 
parameter homogeneous functions of a specific form is obviously a severe 
one and indicates that only a narrow subclass of the utility functions 
which generate linear Engel curves are additive. 

In this paper I apply the linear specification of habit formation used in
Pollak [6] to general systems of demand functions with linear Engel curves.
As in Pollak [6], I examine the short-run demand functions and the
implied system of long-run demand functions. I show that there exists a
utility function which rationalizes these long-run demand functions if and
only if they are the steady-state solution to a system of short-run demand
functions generated by an additive short-run utility function. That is, of
the short-run utility functions which yield systems of demand functions
with linear Engel curves, the narrow class of additive ones used as examples
in Pollak [6] are the only ones which, under linear habit formation, yield
long-run demand functions which can be rationalized by utility functions.

It is not surprising that only a narrow class of short-run utility functions 
generate long-run demand functions which can be rationalized by a 
utility function. Gorman [2] examines a continuous-time model of habit 
formation and shows that the implied long-run demand functions satisfy 
the Slutsky symmetry conditions only in very special cases. Since Gorman 
works in continuous time, his results are not directly comparable to the 
discrete time formulations von Weizsacker and I consider, but one would 
not expect the discrete and continuous cases to be fundamentally different. 

Von Weizsacker [9] is primarily concerned with the welfare rather than
the positive implications of habit formation. However, his entire discussion 
rests on his claim that the long-run demand functions can be rationalized 
by a utility function. In this paper, I show that his claim is not valid, except 
in some very special cases. 

Von Weizsacker begins with a system of short-run demand functions
of the form

$$q_{1t} = h^{1t}(p_{1t}, p_{2t}, \mu_t, q_{1t-1}, q_{2t-1}),$$

$$q_{2t} = h^{2t}(p_{1t}, p_{2t}, \mu_t, q_{1t-1}, q_{2t-1}),$$

where $p$’s, $q$’s, and $\mu$’s denote prices, quantities, and total expenditure,
hereafter referred to as income. Von Weizsacker places no restrictions on




the form of the short-run demand functions or on the way the previous 
period’s consumption influences them, but he does restrict his analysis by 
assuming that there are only two goods: 

For simplicity, I shall assume that there are only two goods. Also for simplicity, 

I shall assume that tastes are only influenced by the consumption vector of the 
last period. Influences from periods before the last one are neglected. Although 
the mathematics would become more complicated, I presume that the relaxation 
of these two assumptions would not change the substance of the argument. 

Unfortunately, where integrability conditions are concerned, the 
assumption that there are only two goods does change the substance of 
the argument. In particular, the Slutsky symmetry conditions are always 
satisfied in the two-good case, provided the demand functions satisfy the 
budget constraint and are homogeneous of degree zero in prices and 
income. Hence, there is no presumption that either an argument or a 
result which holds in the two-good case will generalize to the n-good case, 
where n > 2. In fact, neither von Weizsacker’s argument concerning the 
existence of a long-run utility function nor his result generalize; for n > 2 
the long-run demand functions can be rationalized by a utility function 
only in certain exceptional cases. Viewed from the perspective of 
von Weizsacker [9], this paper demonstrates that the principal results of 
that paper do not generalize beyond the two-good case. It does this by 
means of what I have called a “critical counterexample.” Of course, any 
counterexample is sufficient to refute a conjecture, but there are many 
instances in which a counterexample does not go to the heart of the 
conjecture and hence leaves open the question of the validity of its principal 
claim. For example, suppose I conjecture that any real-valued function 
defined on a closed interval has a zero derivative at its maximum, and you 
offer, as a counterexample, a function with a boundary maximum. I would 
respond by adding a clause to my conjecture, restricting it to interior 
maxima. In this case, the counterexample leads to a refinement of the 
original conjecture; its scope is restricted, but the basic insight expressed 
by the original conjecture remains intact. By a critical counterexample 
I mean one which challenges the central contention of the conjecture. Like 
any counterexample, it demonstrates that the conjecture is incorrect as it 
stands. But because it challenges the basic perception of the conjecture, it 
also shows that the original conjecture cannot be “saved” by adding extra 
clauses to rule out “pathological” cases. This is not because a valid 
conjecture cannot be obtained by restricting its scope; usually, a true 
conjecture can be obtained in this way. The difficulty is that the restricted
conjecture would be such an attenuated version of the original that it 
cannot fairly be said to embody its principal contention. The essential 





feature of a critical counterexample is that it forces abandonment of the 
central assertion of the original conjecture. 1
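The two-good claim made above, that Slutsky symmetry holds automatically once the budget constraint and homogeneity are imposed, can be checked directly. The following derivation is a standard textbook argument added here for convenience and is not taken from the paper; the notation $s_{ij}$ for the Slutsky terms is introduced only for this sketch.

```latex
% Slutsky terms for demand functions h^i(P, \mu):
s_{ij} = \frac{\partial h^i}{\partial p_j} + h^j \frac{\partial h^i}{\partial \mu}.

% Budget constraint \sum_i p_i h^i = \mu, differentiated with respect to p_j and \mu:
\sum_i p_i \frac{\partial h^i}{\partial p_j} + h^j = 0, \qquad
\sum_i p_i \frac{\partial h^i}{\partial \mu} = 1
\;\Longrightarrow\; \sum_i p_i s_{ij} = 0.

% Homogeneity of degree zero (Euler's theorem), using \mu = \sum_j p_j h^j:
\sum_j p_j \frac{\partial h^i}{\partial p_j} + \mu \frac{\partial h^i}{\partial \mu} = 0
\;\Longrightarrow\; \sum_j p_j s_{ij} = 0.

% With n = 2, take j = 1 in the first result and i = 1 in the second:
p_1 s_{11} + p_2 s_{21} = 0, \qquad p_1 s_{11} + p_2 s_{12} = 0
\;\Longrightarrow\; s_{12} = s_{21}.
```

For $n > 2$ the same two sets of restrictions no longer determine every off-diagonal pair, so symmetry becomes a genuine additional requirement.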

It might be argued that my results constitute a critical counterexample
to the von Weizsacker conjecture only for the class of utility functions
which generate linear Engel curves. It is conceivable that, from the standpoint
of the von Weizsacker conjecture, linear Engel curves or linear habit
formation are “pathological” cases and that the conjecture is generally
valid when Engel curves and habit formation are nonlinear. I find this 
implausible, since there is no apparent reason why these cases should be 
exceptional. They were chosen as the focus of this investigation, not 
because it appeared more likely that the conjecture would fail in these 
cases than in others, but because they are relatively tractable. 

The plan of this paper is as follows: In Section 2, I summarize the 
relevant results on additive utility functions and linear Engel curves from 
Pollak [7]. In Section 3, I describe the specification of habit formation and
investigate its implications for general systems of demand functions with 
linear Engel curves. In Section 4, I state and prove the central theorem of 
this paper: Suppose that the short-run demand functions have linear 
Engel curves, that tastes are subject to linear habit formation, and that the 
number of goods exceeds two; then the long-run demand functions can be 
rationalized by a utility function if and only if the short-run utility function 
is additive. In Section 5, I discuss some of the welfare issues raised by 
von Weizsacker and argue that, even when the long-run utility function 
exists, it has no welfare significance. Section 6 is a brief summary. 


2. Additive Utility Functions and Linear Engel Curves 

In this section I summarize the results on additive utility functions and
linear Engel curves obtained in Pollak [7]. A more thorough exposition,
proofs, and references to the literature may be found there.

Definition. Let $U(Q)$ be a utility function, where $Q$ denotes the
commodity vector $(q_1, \ldots, q_n)$. 2

1 For a fascinating discussion of the role of counterexamples in mathematics, see
Lakatos [4]. The term “critical counterexample” is my own.

2 We consider only utility functions which are defined in a subset, $R$, of the commodity
space and satisfy the following regularity conditions:

(i) the set of all $Q \in R$ for which $U(Q) \ge \bar{U}$ is strictly convex for all $\bar{U}$.

(ii) $U$ has strictly positive first-order partial derivatives everywhere in $R$.

(iii) $U$ has continuous second- and third-order partial derivatives everywhere in $R$.





If there exist $n$ functions, $u^k(q_k)$, and a thrice differentiable function $F$,
$F' > 0$, such that

$$F[U(Q)] = \sum_{k=1}^{n} u^k(q_k), \tag{2.1}$$

then we say that the utility function $U$ is additive. 3

Definition. Let $\{h^1(P, \mu), \ldots, h^n(P, \mu)\}$ denote a system of demand
functions, where $P$ denotes the price vector $(p_1, \ldots, p_n)$ and $\mu$ denotes total
expenditure or “income.” 4 If the demand functions are of the form

$$h^i(P, \mu) = X^i(P) + \gamma^i(P)\mu \tag{2.2}$$

in some region of the price-income space, we say that the demand functions
are locally linear in income or, briefly, linear. If

$$h^i(P, \mu) = \gamma^i(P)\mu,$$

we say that the demand functions exhibit expenditure proportionality.
Expenditure proportionality is a special case of linearity.

The income-consumption curves are straight lines in a region of the 
commodity space if and only if the demand functions are linear; they are 
straight lines radiating from the origin if and only if the demand functions 
exhibit expenditure proportionality. Houthakker [3] has pointed out that, 
if the income-consumption curves are linear almost everywhere, then they 
must either go through the origin or, when account is taken of the nonnegativity
constraints, they must be broken curves with linear segments.
A kink occurs when an increase in income causes the individual to consume 
a good which he had not previously consumed. This suggests that the 
linearity hypothesis must be applied with caution, but it is not a valid 
reason for rejecting it. 

In [7] I prove the following. 

3 We require that additive utility functions satisfy the additional regularity condition
$u^{i\prime\prime}(q_i) < 0$, where $u^{i\prime\prime}(q_i)$ denotes the second derivative of $u^i(q_i)$. This assumption of
“diminishing marginal utility” is more restrictive than the convexity requirement of
footnote 2. If $U$ is an additive utility function and $u^{i\prime\prime}(q_i) > 0$ ($= 0$), then an
increase in income will cause an increase in the consumption of the $i$th good and a
decrease (no change) in the consumption of every other good. Since we are concerned
with situations in which demand functions are linear in income, there is no serious loss
of generality in ruling out this perverse case.

4 We consider only systems of demand functions which can be generated by utility
functions satisfying the regularity conditions of footnote 2.





Theorem. An additive utility function yields demand functions locally
linear in income if and only if it is of one of the three following forms:

$$U(Q) = \sum_{k=1}^{n} a_k \log(q_k - b_k), \qquad a_i > 0, \quad (q_i - b_i) > 0, \quad \sum a_k = 1, \tag{2.3}$$

$$U(Q) = \sum_{k} \alpha_k(\beta_k + \gamma_k q_k)^c, \tag{2.4}$$

$$U(Q) = -\sum_{k=1}^{n} \alpha_k e^{-\beta_k q_k}, \qquad \alpha_i > 0, \quad \beta_i > 0. \tag{2.5a}$$

5

The utility function (2.3) represents the familiar Klein-Rubin-Stone-Geary
linear expenditure system; the implied demand functions are given
by

$$h^i(P, \mu) = b_i - \frac{a_i}{p_i}\sum_k b_k p_k + \frac{a_i}{p_i}\,\mu. \tag{2.6}$$
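A short numerical sketch may help fix the structure of (2.6). The Python function below evaluates the linear expenditure system for invented parameter values; the name `les_demand` and all numbers are assumptions made only for this illustration. It checks that the demands exhaust income and that each Engel curve has constant slope $a_i/p_i$.

```python
# Hypothetical illustration of Eq. (2.6); all numbers are invented.

def les_demand(p, mu, a, b):
    """Linear expenditure system: q_i = b_i + (a_i / p_i) (mu - sum_k b_k p_k)."""
    committed = sum(bk * pk for bk, pk in zip(b, p))   # cost of the b's
    return [bi + (ai / pi) * (mu - committed) for ai, bi, pi in zip(a, b, p)]


p = [1.0, 2.0, 4.0]     # assumed prices
a = [0.5, 0.3, 0.2]     # assumed marginal budget shares, summing to one
b = [1.0, 0.5, 0.0]     # assumed "necessary" quantities
mu = 20.0               # assumed income

q = les_demand(p, mu, a, b)
spending = sum(pi * qi for pi, qi in zip(p, q))
slopes = [q1 - q0 for q1, q0 in zip(les_demand(p, mu + 1.0, a, b), q)]
print(q, spending, slopes)
```

The income slopes do not depend on $\mu$, which is what "locally linear in income" means in (2.2).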

Any admissible utility function of the form (2.4) can be written in one
of the following three forms:

$$U(Q) = -\sum_{k=1}^{n} a_k(q_k - b_k)^c, \qquad c < 0, \quad a_i > 0, \quad (q_i - b_i) > 0, \tag{2.7}$$

$$U(Q) = \sum_{k=1}^{n} a_k(q_k - b_k)^c, \qquad 0 < c < 1, \quad a_i > 0, \quad (q_i - b_i) > 0, \tag{2.8}$$

$$U(Q) = -\sum_{k=1}^{n} a_k(b_k - q_k)^c, \qquad c > 1, \quad a_i > 0, \quad (b_i - q_i) > 0. \tag{2.9}$$

(2.7) and (2.8) are translations of the members of the “Bergson family”
of utility functions, whose indifference maps correspond to the isoquant
maps of the CES class of production functions, while (2.9) includes the
additive quadratic. The demand functions corresponding to (2.7)-(2.9) are
given by

$$h^i(P, \mu) = b_i - \gamma^i(P)\sum_{k=1}^{n} b_k p_k + \gamma^i(P)\,\mu, \tag{2.10}$$

where $\gamma^i(P)$ is given by

$$\gamma^i(P) = \frac{(p_i/a_i)^{1/(c-1)}}{\sum_k p_k(p_k/a_k)^{1/(c-1)}}. \tag{2.11}$$
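The form of (2.11) can be verified against the first-order conditions of (2.8). The sketch below is an illustration under invented parameters, with hypothetical function names; it computes $\gamma^i(P)$, builds the demands (2.10), and checks both the budget identity $\sum_k p_k\gamma^k = 1$ and the equality of marginal utility per unit of expenditure across goods.

```python
# Hypothetical illustration of (2.10) and (2.11) for case (2.8); numbers are invented.

def gamma(p, a, c):
    """Marginal budget coefficients gamma^i(P) of Eq. (2.11)."""
    weights = [(pk / ak) ** (1.0 / (c - 1.0)) for pk, ak in zip(p, a)]
    denom = sum(pk * wk for pk, wk in zip(p, weights))
    return [wk / denom for wk in weights]


def demand(p, mu, a, b, c):
    """Demand functions of Eq. (2.10)."""
    g = gamma(p, a, c)
    supernumerary = mu - sum(bk * pk for bk, pk in zip(b, p))
    return [bi + gi * supernumerary for bi, gi in zip(b, g)]


p = [1.0, 2.0, 4.0]     # assumed prices
a = [1.0, 1.0, 2.0]     # assumed utility weights
b = [1.0, 0.5, 0.0]     # assumed translation parameters
c = 0.5                 # assumed exponent, 0 < c < 1
mu = 20.0               # assumed income

g = gamma(p, a, c)
q = demand(p, mu, a, b, c)
budget_weights = sum(pk * gk for pk, gk in zip(p, g))
# Marginal utility of (2.8) per unit of expenditure: a_k c (q_k - b_k)^(c-1) / p_k
mu_per_dollar = [ak * c * (qk - bk) ** (c - 1.0) / pk
                 for ak, qk, bk, pk in zip(a, q, b, p)]
print(g, q, mu_per_dollar)
```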


5 The fixed coefficient form is also discussed in Pollak [7], but it is excluded by the
regularity conditions of footnote 2.





The income-consumption curves corresponding to (2.3), (2.7), and (2.8)
radiate upward from the point $(b_1, \ldots, b_n)$, and in these cases the $b$'s are
often interpreted as necessary or subsistence quantities. Such an interpretation
is often useful, but it should be taken figuratively rather than
literally, especially since negative $b$'s are admissible. The income-consumption
curve corresponding to (2.9) radiates downward from $(b_1, \ldots, b_n)$, and
in this case the $b$’s are interpreted as satiation or bliss points.

The utility function (2.5a) can be regarded as a limiting form of (2.4) 
since 

lim - £ a k (l + g k ) = ~X oL k e~ 8lfi \ 

e — T v c ’ T 

It is convenient to rewrite (2.5a) as

$$U(Q) = -\sum_k a_k e^{(b_k - q_k)/a_k}, \qquad a_i > 0, \tag{2.5b}$$

where $a_k = 1/\beta_k$ and $b_k = (1/\beta_k)\log \alpha_k\beta_k$. The corresponding demand
functions are

$$h^i(P, \mu) = b_i - \frac{a_i}{\sum_k a_k p_k}\sum_k b_k p_k - a_i\log p_i + \frac{a_i}{\sum_k a_k p_k}\sum_k a_k p_k\log p_k + \frac{a_i}{\sum_k a_k p_k}\,\mu. \tag{2.12}$$

Gorman [1] has shown that, if an individual’s demand functions are
linear, then his indirect utility function can be written in the form

$$\Psi(P, \mu) = \frac{\mu}{g(P)} - \frac{f(P)}{g(P)}, \tag{2.13}$$

where $f(P)$ and $g(P)$ are homogeneous of degree one. Because the indirect
utility function is ordinal, the phrase “can be written” must be interpreted
carefully. Formally,

Theorem (Gorman). If an individual’s demand functions are linear in
income and his preferences can be represented by an indirect utility function,
$\Psi(P, \mu)$, then there exists a function $G$, $G' > 0$, and functions $f(P)$ and
$g(P)$, homogeneous of degree one, such that

$$G[\Psi(P, \mu)] = \frac{\mu}{g(P)} - \frac{f(P)}{g(P)}. \tag{2.14}$$

As Gorman shows, the demand functions corresponding to (2.14) are
given by

$$h^i(P, \mu) = f_i - \frac{g_i}{g}f + \frac{g_i}{g}\,\mu, \tag{2.15}$$





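where $f_i(P)$ and $g_i(P)$ denote the partial derivatives with respect to $p_i$ of $f$
and $g$, respectively. 6 The “Gorman forms” of the indirect utility functions
corresponding to (2.3)-(2.5) are given by

$$g(P) = \prod_k p_k^{a_k} \quad\text{and}\quad f(P) = \sum_k b_k p_k, \tag{2.16}$$

$$g(P) = \Big[\sum_k a_k^{-1/(c-1)} p_k^{c/(c-1)}\Big]^{(c-1)/c} \quad\text{and}\quad f(P) = \sum_k b_k p_k, \tag{2.17}$$

$$g(P) = \sum_k a_k p_k \quad\text{and}\quad f(P) = \sum_k b_k p_k + \Big(\sum_k a_k p_k\Big)\log\Big(\sum_k a_k p_k\Big) - \sum_k a_k p_k\log p_k, \tag{2.18}$$

respectively.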
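The link between the Gorman form and the demand functions can be illustrated numerically. Formula (2.15) follows from (2.13) by Roy's identity, $h^i = -\Psi_i/\Psi_\mu$. The sketch below, which is only an illustration with invented numbers and hypothetical helper names, applies (2.15) to the pair (2.16) using finite-difference derivatives and compares the result with the closed-form linear expenditure system (2.6).

```python
# Hypothetical illustration of (2.15) applied to (2.16); all numbers are invented.
import math


def partial(fn, p, i, h=1e-6):
    """Central finite-difference approximation to d fn / d p_i."""
    up, down = p[:], p[:]
    up[i] += h
    down[i] -= h
    return (fn(up) - fn(down)) / (2.0 * h)


def gorman_demand(f, g, p, mu):
    """Eq. (2.15): h^i = f_i - (g_i / g) f + (g_i / g) mu."""
    fp, gp = f(p), g(p)
    out = []
    for i in range(len(p)):
        ratio = partial(g, p, i) / gp
        out.append(partial(f, p, i) - ratio * fp + ratio * mu)
    return out


a = [0.5, 0.3, 0.2]     # assumed exponents, summing to one
b = [1.0, 0.5, 0.0]     # assumed "necessary" quantities
p = [1.0, 2.0, 4.0]     # assumed prices
mu = 20.0               # assumed income


def g(prices):
    return math.prod(pk ** ak for pk, ak in zip(prices, a))   # g(P) of (2.16)


def f(prices):
    return sum(bk * pk for bk, pk in zip(b, prices))          # f(P) of (2.16)


q_gorman = gorman_demand(f, g, p, mu)
q_closed = [bi + (ai / pi) * (mu - f(p)) for ai, bi, pi in zip(a, b, p)]  # Eq. (2.6)
print(q_gorman, q_closed)
```

The two vectors agree up to finite-difference error, since $g_i/g = a_i/p_i$ and $f_i = b_i$ for the pair (2.16).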


3. Linear Habit Formation 

In this section I describe the specification of linear habit formation and
investigate its implications for systems of demand functions locally linear
in income. A more detailed exposition of the linear habit formation model
applied to a specific class of demand functions, a proof that the implied
system of dynamic demand functions is stable, and references to the
literature may be found in Pollak [6].

Following Pollak [6], we introduce habit formation into the linear
expenditure system (2.3) by postulating that the $b$'s depend linearly on past
consumption. More specifically, we assume that the “necessary” or
“subsistence” quantity of good $i$ in period $t$ depends linearly on consumption
of that good in the previous period:

$$b_{it} = b_i^* + \beta_i q_{it-1}, \qquad 0 < \beta_i < 1. \tag{3.1}$$

7 Here $b_i^*$ can be interpreted as a “physiologically necessary” component
of $b_{it}$ and $\beta_i q_{it-1}$ as the “psychologically necessary” component.

If all goods are subject to habit formation of the type described by (3.1),
then the demand functions are given by

$$h^{it}(P, \mu, Q_{t-1}) = b_i^* - \frac{a_i}{p_i}\sum_k p_k b_k^* + \frac{a_i}{p_i}\,\mu + \beta_i q_{it-1} - \frac{a_i}{p_i}\sum_k p_k\beta_k q_{kt-1}, \tag{3.2}$$

6 It should be remarked that the function $f(P)$ is not determined uniquely; nothing
is changed when we replace $f(P)$ by $f^*(P) = f(P) + \omega g(P)$.

7 The requirement $\beta_i < 1$ is a stability condition.







where time subscripts on the $p$’s and $\mu$ have been suppressed. These short-run
demand functions, like their static counterparts, are locally linear in
income. Since the $b$’s are linear in past consumption and since current
consumption depends linearly on the $b$’s, present consumption of each
good is a linear function of past consumption of all goods. Since the $\beta$’s
are positive, there is a positive relation between past and current consumption
of each good and a negative relation between past consumption of a
good and current consumption of every other good.

In this paper I shall consider only the habit formation specification (3.1),
which implies that consumption in the previous period influences current
preferences and demand but that consumption in the more distant past
does not. A more general specification in which the $b$’s depend linearly on
a geometrically weighted average of past consumption is considered in
Pollak [6], and all of the results obtained in this paper can be extended
without difficulty to the more general case. 8

This specification of “linear habit formation” can be applied to any
system of demand functions locally linear in income. We modify the
Gorman form of the indirect utility function (2.13) by replacing $f(P)$ by
$f(P) + \sum p_k b_k$; the indirect utility function now becomes

$$\Psi(P, \mu) = \frac{\mu - f(P) - \sum p_k b_k}{g(P)}. \tag{3.3}$$

The corresponding demand functions are given by

$$h^i(P, \mu) = b_i - \frac{g_i}{g}\sum p_k b_k + f_i - \frac{g_i}{g}f + \frac{g_i}{g}\,\mu \tag{3.4}$$

and are linear in income and in the $b$’s. We now proceed as we did in the
case of the linear expenditure system, by postulating that the $b$’s depend
linearly on past consumption. This yields a system of short-run demand
functions of the form

$$h^{it}(P_t, \mu_t, Q_{t-1}) = b_i^* - \frac{g_i}{g}\sum p_k b_k^* + f_i - \frac{g_i}{g}f + \frac{g_i}{g}\,\mu + \beta_i q_{it-1} - \frac{g_i}{g}\sum p_k\beta_k q_{kt-1}. \tag{3.5}$$

Given the consumption vector of period zero and given prices and 

8 McCarthy [5] shows that a more general system of this type is stable even when
different goods have different “memory” coefficients, which is a result I was unable to
establish in Pollak [6].





income of period one, the short-run demand functions yield a consumption
vector for period one. In a “steady state” or “long-run equilibrium” the
optimal consumption vector for period one will be identical with the
consumption vector of period zero. And, if prices and income remain
constant over time, the optimal consumption vector in every subsequent
period will also be equal to the consumption vector of period zero.

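The long-run equilibrium consumption vector can be found by solving
the short-run demand functions (3.5) under the assumption that
$q_{it} = q_{it-1} = q_i$ for all $i$. To save notation, we replace $g_i/g$ by $\gamma^i$. We
solve (3.5) for $q_i$ as a function of the $p$’s, $\mu$, and $\sum p_k\beta_k q_k$:

$$q_i = \frac{b_i^* + f_i}{1 - \beta_i} + \frac{\gamma^i}{1 - \beta_i}\,\sigma, \tag{3.6}$$

where

$$\sigma = \mu - \sum p_k b_k^* - f - \sum p_k\beta_k q_k.$$

Multiplying (3.6) by $p_i$ and summing over $i$ yields

$$\sum p_k q_k = \mu = \sum p_k\Big(\frac{b_k^* + f_k}{1 - \beta_k}\Big) + \sigma\sum\frac{p_k\gamma^k}{1 - \beta_k},$$

so that

$$\sigma = \frac{\mu - \sum[p_k(b_k^* + f_k)/(1 - \beta_k)]}{\sum[p_k\gamma^k/(1 - \beta_k)]}.$$

Substituting for $\sigma$ in (3.6), we obtain the long-run demand functions.
This proves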
This proves 


Theorem. Suppose that the short-run demand functions are locally linear
in income (3.4),

$$q_{it} = b_{it} + f_i(P_t) + \gamma^i(P_t)\Big[\mu_t - f(P_t) - \sum p_{kt}b_{kt}\Big],$$

and $b_{it}$ is given by the linear habit function (3.1),

$$b_{it} = b_i^* + \beta_i q_{it-1}.$$

Then the long-run demand functions are given by

$$h^i(P, \mu) = B^i(P) - \Gamma^i(P)\sum p_k B^k(P) + \Gamma^i(P)\,\mu, \tag{3.7}$$

where

$$B^i(P) = \frac{b_i^* + f_i(P)}{1 - \beta_i} \tag{3.8}$$

and

$$\Gamma^i(P) = \frac{\gamma^i(P)/(1 - \beta_i)}{\sum[p_k\gamma^k(P)/(1 - \beta_k)]}. \tag{3.9}$$
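The steady-state interpretation of (3.7)-(3.9) can be illustrated by simulation. The sketch below specializes to the linear expenditure system, taking $\gamma^i = a_i/p_i$ and $f = 0$ so that the whole intercept is carried by the $b$'s as in (3.2); this specialization, the parameter values, and the function names are assumptions made for the example. It iterates the short-run demand functions at constant prices and income and compares the limit with the long-run formula.

```python
# Hypothetical illustration of (3.1), (3.2) and (3.7)-(3.9); numbers are invented.

def short_run(p, mu, a, b_star, beta, q_prev):
    """Short-run LES demand with habit-dependent b's, Eqs. (3.1)-(3.2)."""
    b = [bs + bt * qp for bs, bt, qp in zip(b_star, beta, q_prev)]   # Eq. (3.1)
    committed = sum(bk * pk for bk, pk in zip(b, p))
    return [bi + (ai / pi) * (mu - committed) for ai, bi, pi in zip(a, b, p)]


def long_run(p, mu, a, b_star, beta):
    """Long-run demand from Eqs. (3.7)-(3.9) with gamma^i = a_i/p_i and f = 0."""
    big_b = [bs / (1.0 - bt) for bs, bt in zip(b_star, beta)]            # Eq. (3.8)
    raw = [(ai / pi) / (1.0 - bt) for ai, pi, bt in zip(a, p, beta)]
    denom = sum(pk * rk for pk, rk in zip(p, raw))
    big_gamma = [rk / denom for rk in raw]                               # Eq. (3.9)
    committed = sum(pk * bk for pk, bk in zip(p, big_b))
    return [bi + gi * (mu - committed) for bi, gi in zip(big_b, big_gamma)]  # Eq. (3.7)


p = [1.0, 2.0, 4.0]          # assumed prices, held constant
a = [0.5, 0.3, 0.2]          # assumed marginal budget shares
b_star = [1.0, 0.5, 0.0]     # assumed "physiological" components
beta = [0.5, 0.3, 0.2]       # assumed habit coefficients, each below one
mu = 20.0                    # assumed income, held constant

q = [1.0, 1.0, 1.0]          # arbitrary consumption vector for period zero
for _ in range(200):
    q = short_run(p, mu, a, b_star, beta, q)

q_star = long_run(p, mu, a, b_star, beta)
gap = max(abs(x - y) for x, y in zip(q, q_star))
print(q, q_star, gap)
```

In this example the simulated path settles on the vector given by (3.7), which is the sense in which the long-run demand functions are the steady-state solution of the short-run system.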


In [6] I considered the implications of linear habit formation for the 
demand functions generated by the additive utility functions (2.3)—(2.5); 
all of these yield systems of demand functions with linear Engel curves, 
and they exhaust the class of additive utility functions which do so. 
I examined the long-run demand functions and exhibited the utility 
functions which rationalized them. 


Theorem. The long-run demand functions corresponding to (2.3) can be
rationalized by the utility function

$$U(Q) = \sum A_k\log(q_k - B_k), \qquad A_i > 0, \quad (q_i - B_i) > 0, \quad \sum A_k = 1, \tag{3.10}$$

where $A_i$ and $B_i$ are given by

$$A_i = \frac{a_i/(1 - \beta_i)}{\sum[a_k/(1 - \beta_k)]}, \qquad B_i = \frac{b_i^*}{1 - \beta_i}. \tag{3.11}$$


The long-run demand functions corresponding to (2.7)-(2.9) can be
rationalized by the utility functions

$$U(Q) = -\sum A_k(q_k - B_k)^c, \qquad c < 0, \quad A_i > 0, \quad (q_i - B_i) > 0, \tag{3.12}$$

$$U(Q) = \sum A_k(q_k - B_k)^c, \qquad 0 < c < 1, \quad A_i > 0, \quad (q_i - B_i) > 0, \tag{3.13}$$

$$U(Q) = -\sum A_k(B_k - q_k)^c, \qquad c > 1, \quad A_i > 0, \quad (B_i - q_i) > 0, \tag{3.14}$$

respectively, where $A_i$ and $B_i$ are given by

$$A_i = \frac{a_i}{(1 - \beta_i)^{1-c}}, \qquad B_i = \frac{b_i^*}{1 - \beta_i}. \tag{3.15}$$

The long-run demand functions corresponding to (2.5) can be rationalized by
the utility function

$$U(Q) = -\sum A_k\exp[(B_k - q_k)/A_k], \tag{3.16}$$

where $A_i$ and $B_i$ are given by





The reader should convince himself that the long-run demand functions
cannot be rationalized by the utility function obtained by replacing $q_{it}$ by
$q_i$ and $b_{it} = b_i^* + \beta_i q_{it-1}$ by $b_i = b_i^* + \beta_i q_i$ in the short-run utility
function. For example, in the case of the linear expenditure system, (2.3), this
yields the utility function

$$\sum a_k \log[q_k - (b_k^* + \beta_k q_k)], \tag{3.18}$$

but it is easy to verify that maximizing (3.18) subject to the budget
constraint does not yield the long-run demand functions implied by (3.7).
The difficulty is that the individual treats $b_{it}$ as a constant and not as a
function of $q_{it}$, so that maximization with respect to the lagged value of $q_i$
which appears in $b_{it}$ is inappropriate.$^{9}$
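The verification can be sketched numerically. Since $q_k - (b_k^* + \beta_k q_k) = (1-\beta_k)[q_k - b_k^*/(1-\beta_k)]$, the naive function (3.18) differs from a linear expenditure system utility with weights $a_k$ and subsistence levels $b_k^*/(1-\beta_k)$ only by additive constants, whereas the long-run demands (3.7) use weights proportional to $a_k/(1-\beta_k)$. The code below is an illustrative sketch under that reading; the function names and parameter values are hypothetical.

```python
def les_demand(p, mu, weights, subsistence):
    """Demand from maximizing sum_k w_k log(q_k - s_k) subject to sum_k p_k q_k = mu."""
    total = sum(weights)
    slack = mu - sum(pk * sk for pk, sk in zip(p, subsistence))
    return [sk + (wk / total) * slack / pk
            for sk, wk, pk in zip(subsistence, weights, p)]


def naive_demand(p, mu, a, b_star, beta):
    """Maximizer of (3.18): weights a_k, subsistence b_k* / (1 - beta_k)."""
    B = [bs / (1 - be) for bs, be in zip(b_star, beta)]
    return les_demand(p, mu, a, B)


def long_run_demand(p, mu, a, b_star, beta):
    """Long-run demand (3.7) for the LES: weights a_k / (1 - beta_k), same subsistence."""
    B = [bs / (1 - be) for bs, be in zip(b_star, beta)]
    A = [ak / (1 - be) for ak, be in zip(a, beta)]
    return les_demand(p, mu, A, B)


# Hypothetical parameter values chosen only for illustration.
a = [0.3, 0.7]
b_star = [1.0, 0.5]
p = [2.0, 1.0]
mu = 20.0

unequal = [0.6, 0.2]   # habit coefficients differ across goods
equal = [0.4, 0.4]     # habit coefficients identical across goods

gap_unequal = max(abs(x - y) for x, y in zip(naive_demand(p, mu, a, b_star, unequal),
                                             long_run_demand(p, mu, a, b_star, unequal)))
gap_equal = max(abs(x - y) for x, y in zip(naive_demand(p, mu, a, b_star, equal),
                                           long_run_demand(p, mu, a, b_star, equal)))
```

In this sketch the two demand systems coincide only when every good has the same habit coefficient; otherwise the naive maximization gives the wrong marginal budget shares.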

In [6] I also showed that, provided $0 \le \beta_i < 1$, the dynamic demand
functions corresponding to (2.3)-(2.5) are locally stable. If $Q^0$, the
consumption vector of period zero, is given, then the short-run demand
functions determine $Q^1$ as a function of $P^1$, $\mu_1$, and $Q^0$. In the same way,
$Q^2$ is determined as a function of $P^2$, $\mu_2$, and $Q^1$, or, more conveniently, as
a function of $Q^0$, $P^1$, $\mu_1$, $P^2$, $\mu_2$. Thus, for any initial consumption vector
$Q^0$ and any price-income sequence $\{(P^1, \mu_1), (P^2, \mu_2), \ldots\}$ the short-run
demand functions determine the corresponding consumption sequence,
$\{Q^1, Q^2, \ldots\}$.

The long-run demand functions identify the steady-state consumption
vector, $Q^*$, corresponding to the price-income situation $(P^*, \mu^*)$. Clearly,
if $Q^0 = Q^*$ and $\{(P^1, \mu_1), (P^2, \mu_2), \ldots\} = \{(P^*, \mu^*), (P^*, \mu^*), \ldots\}$, then
$\{Q^1, Q^2, \ldots\} = \{Q^*, Q^*, \ldots\}$. In [6], I show that, if $Q^0$ is sufficiently close to
$Q^*$, then the consumption sequence corresponding to $\{(P^*, \mu^*),
(P^*, \mu^*), \ldots\}$ will converge to $Q^*$. The stability proof given in Pollak [6]
applies to any system of demand functions locally linear in income; hence,
provided $0 \le \beta_i < 1$, the stability of the dynamic demand functions (3.4)
is guaranteed.


4. The Existence of the Long-Run Utility Function 

In this section I prove the central theorem of this paper. Informally, it
can be stated as follows: Consider a world with more than two goods, in
which the demand functions are locally linear in income and are subject to
linear habit formation. In such a world, the long-run demand functions
can be rationalized by a utility function if and only if the short-run demand
functions are generated by an additive utility function. More formally:

9 A good illustration is provided by the short-run utility function $U(Q_t; Q_{t-1}) =
V(Q_t) + V^*(Q_{t-1})$. The long-run utility function is given by $W(Q) = V(Q)$, not by
$W(Q) = V(Q) + V^*(Q)$.

Theorem. Suppose $n \ge 3$, that the short-run demand functions are
locally linear in income,

$$q_{it} = b_{it} + f_i(P_t) + \frac{g_i(P_t)}{g(P_t)}\Big[\mu_t - f(P_t) - \sum b_{kt} p_{kt}\Big],$$

where $f$ and $g$ are functions homogeneous of degree one with $g_i(P_t) \ne 0$, and
that $b_{it}$ is given by the linear habit function (3.1),

$$b_{it} = b_i^* + \beta_i q_{it-1}.$$ $^{10}$

Then the corresponding long-run demand functions (3.7) can be rationalized
by a long-run utility function if and only if

$$g(P) = \prod p_k^{a_k} \quad\text{and}\quad f(P) = \sum b_k p_k \tag{2.16}$$

or

$$g(P) = \Big[\sum a_k p_k^c\Big]^{1/c} \quad\text{and}\quad f(P) = \sum b_k p_k \tag{2.17}$$

or

$$g(P) = \sum a_k p_k \quad\text{and}\quad f(P) = \sum b_k p_k + \Big(\sum a_k p_k\Big) \log\Big(\sum a_k p_k\Big) - \sum a_k p_k \log p_k. \tag{2.18}$$

It might appear paradoxical that in each period the short-run demand
functions can be rationalized by a short-run utility function while the
long-run demand functions cannot be rationalized by a long-run utility
function. Indeed, one might reason that, as the short-run demand functions
converge to the long-run demand function, the short-run utility function
should converge to the long-run utility function. But this is incorrect. Let
$Q^t = h(P^t, \mu_t, Q^{t-1})$ denote the short-run demand functions. Given prices
$P^*$ and income $\mu^*$, the quantity demanded will converge to $Q^*$, where
$Q^* = h(P^*, \mu^*, Q^*)$. The short-run demand functions are generated by
maximizing the short-run utility function $U(Q^t, Q^{t-1})$ with respect to $Q^t$,
but, as we saw in the example at the end of Section 3, the long-run demand
functions are not generated by maximizing $V(Q) = U(Q, Q)$ with respect
to $Q$. In the limit, the sequence of short-run utility functions approaches a
limit, namely, the short-run utility function $U(Q, Q^*)$, and maximizing
this short-run utility function with respect to $Q$ generates the short-run
demand functions $Q = h(P, \mu, Q^*)$. At $(P^*, \mu^*)$ these short-run demand
functions coincide with the long-run demand functions, but, in general,
they coincide only at the point $(P^*, \mu^*)$ and not for all $(P, \mu)$, even in a
neighborhood of $(P^*, \mu^*)$.

10 Ruling out $g_i = 0$ excludes goods which are on the border line between inferiority
and noninferiority. This avoids problems about division by zero in an already long and
messy proof. A similar assumption was made in conjunction with additivity in
footnote 3.

Proof. Necessity follows directly from the results cited in Sections 2
and 3. If the indirect utility function is of the form (2.16), (2.17), or (2.18),
then the direct utility function is given by (2.3), (2.4), or (2.5) (Pollak [7]).
But each of these cases leads to long-run demand functions which can be
rationalized by a utility function (Pollak [6]).

Sufficiency is more difficult. We first show that g is of the required form. 

If the long-run demand functions (3.7) can be rationalized by a utility
function, then they can be written in the Gorman form

$$h^i(P, \mu) = F_i(P, \beta, b^*) - \frac{G_i(P, \beta, b^*)}{G(P, \beta, b^*)} F(P, \beta, b^*) + \frac{G_i(P, \beta, b^*)}{G(P, \beta, b^*)}\,\mu, \tag{4.1}$$

since they are locally linear in income. Notice that $F$ and $G$ are functions
of $P$, $\beta$, and $b^*$ and are homogeneous of degree one in $P$. Equating (4.1)
and (3.7) and differentiating with respect to $\mu$ yields

$$\Gamma^i(P, \beta) = \frac{G_i(P, \beta, b^*)}{G(P, \beta, b^*)}. \tag{4.2}$$

Calculating $\Gamma^i_j$ and $\Gamma^j_i$ from (4.2), we find that

$$\Gamma^i_j(P, \beta, b^*) = \Gamma^j_i(P, \beta, b^*), \tag{4.3}$$

where subscripts denote partial derivatives with respect to price. This
relation must hold as an identity in the $\beta$'s and $b^*$'s as well as the $P$'s. To
simplify our notation, let

$$S = \sum \frac{p_k \gamma^k}{1-\beta_k}.$$

We calculate from (3.9)

$$\Gamma^i_j(P, \beta, b^*) = \Big[\frac{\gamma^i_j}{1-\beta_i}\,S - \frac{\gamma^i}{1-\beta_i} \sum \frac{p_k \gamma^k_j}{1-\beta_k}\Big] S^{-2} - \frac{\gamma^i}{1-\beta_i}\,\frac{\gamma^j}{1-\beta_j}\,S^{-2}.$$

The last term is symmetric in $i$ and $j$, so (4.3) implies that the factors in
brackets must also be symmetric. Equating the term in brackets with the
corresponding term in $\Gamma^j_i$ and differentiating with respect to $\beta_t$, $t \ne i, j$,
yields

$$\frac{1}{1-\beta_i}\,[\gamma^i_j \gamma^t - \gamma^i \gamma^t_j] = \frac{1}{1-\beta_j}\,[\gamma^j_i \gamma^t - \gamma^j \gamma^t_i].$$

Differentiating with respect to $\beta_i$ implies

$$\gamma^t \gamma^i_j - \gamma^i \gamma^t_j = 0.$$

Replacing $\gamma^i$ by $g_i/g$, we find that this implies

$$g_{ij}\,g_t - g_{tj}\,g_i = 0, \tag{4.4}$$

but

$$\frac{\partial}{\partial p_j}\Big(\frac{g_i}{g_t}\Big) = 0$$


if and only if (4.4) holds. Hence, $g(P)$ is additive in the ordinal sense. But
$g(P)$ is also homogeneous of degree one, and the only functions which are
both additive and homogeneous of degree one are those of the Bergson
family:

$$g(P) = \prod p_k^{a_k}, \qquad \sum a_k = 1, \tag{4.5}$$

$$g(P) = \Big(\sum a_k p_k^c\Big)^{1/c}, \tag{4.6a}$$

$$g(P) = \sum a_k p_k. \tag{4.6b}$$

Clearly, (4.6b) is a special case of (4.6a), corresponding to $c = 1$. The
Cobb-Douglas case (4.5) is the limiting case corresponding to $c = 0$. Thus,
$g$ must be of one of the three forms asserted in the theorem.
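The limiting claim, and the curvature ratio $g g_{it}/(g_i g_t)$ that the proof relies on when it differentiates (4.14), can be checked directly. The following derivation is a sketch added for the reader, not part of the original argument; it assumes $\sum a_k = 1$, $p_k > 0$, and $t \ne i$.

```latex
\begin{align*}
\log g &= \frac{1}{c}\,\log\Big(\sum a_k p_k^{c}\Big), \\
\lim_{c \to 0} \log g
  &= \lim_{c \to 0} \frac{\sum a_k p_k^{c} \log p_k}{\sum a_k p_k^{c}}
   = \sum a_k \log p_k
   \quad \text{(L'H\^opital's rule, using } \textstyle\sum a_k = 1\text{)}, \\
\text{so}\quad \lim_{c \to 0} g &= \prod p_k^{a_k}. \\[4pt]
g_i &= a_i\, p_i^{c-1}\, g^{1-c}, \\
g_{it} &= (1-c)\, a_i\, p_i^{c-1}\, g^{-c}\, g_t = (1-c)\,\frac{g_i\, g_t}{g}, \\
\frac{g\, g_{it}}{g_i\, g_t} &= 1 - c .
\end{align*}
```

For the Cobb-Douglas form (4.5), $g_i = a_i g/p_i$ and $g_{it} = a_i g_t/p_i$, so the same ratio equals 1, in line with $c = 0$.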

Unfortunately, we cannot identify $B^i$ with $F_i$. On reflection, this is as it
should be, since the Gorman form does not uniquely identify $F$. Since
$f = \bar{f}$ and $f = \bar{f} + Dg$, where $D$ is any constant, imply the same demand
functions, our theorem's claims about the form of $f$ must be understood
to refer to the canonical form in which $D = 0$.

Let $T^i$ be defined by

$$T^i = B^i - \Gamma^i \sum p_k B^k. \tag{4.7}$$

If the long-run demand functions are theoretically plausible, then 





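From (4.2), $\Gamma^i = G_i/G$, so we can subtract the income terms and obtain

$$T^i = F_i - \frac{G_i}{G}\,F = F_i - \Gamma^i F.$$

We calculate

$$T^i_j = F_{ij} - \Gamma^i_j F - \Gamma^i F_j$$

and

$$\Gamma^j T^i - T^i_j = (\Gamma^i T^j + \Gamma^j T^i) + (\Gamma^i_j F - F_{ij} + \Gamma^i \Gamma^j F). \tag{4.8}$$

Equation (4.8) is symmetric in $i$ and $j$ since $\Gamma^i_j = \Gamma^j_i$ from (4.2).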

To pursue the implications of the symmetry of (4.8), we replace $1/(1-\beta_i)$ by
$\alpha_i$, that is, $\alpha_i = 1/(1-\beta_i)$. Replacing $T^i$ by (4.7), where the $B$'s are given by (3.7),
replacing $T^i_j$ by the corresponding partial derivative of $T^i$, and dropping
terms which are clearly symmetric in $i$ and $j$ yield

$$-\alpha_i f_{ij} + \Gamma^i \sum p_k \alpha_k f_{kj}. \tag{4.9}$$

Since (4.9) was obtained from (4.8) by dropping symmetric terms, (4.9)
must also be symmetric in $i$ and $j$. Replacing $\Gamma^i$ by the corresponding
expressions involving $g$'s, (4.9) becomes

$$-\alpha_i f_{ij} + \frac{\alpha_i g_i}{\sum p_k \alpha_k g_k} \sum p_k \alpha_k f_{kj},$$

and the symmetry implies

$$-\alpha_j f_{ji} \sum p_k \alpha_k g_k + \alpha_j g_j \sum p_k \alpha_k f_{ki} = -\alpha_i f_{ij} \sum p_k \alpha_k g_k + \alpha_i g_i \sum p_k \alpha_k f_{kj}.$$

Differentiating with respect to $\alpha_t$, $t \ne i, j$, and dividing by $p_t$ yield

$$-\alpha_j f_{ji}\, g_t + \alpha_j g_j f_{ti} = -\alpha_i f_{ij}\, g_t + \alpha_i g_i f_{tj}.$$

Differentiating with respect to $\alpha_j$ yields

$$-f_{ji}\, g_t + g_j f_{ti} = 0,$$

so, changing subscripts in the interest of clarity,

$$\frac{f_{is}}{f_{ir}} = \frac{g_s}{g_r}, \qquad s, r \ne i. \tag{4.10}$$

We now use the homogeneity of $f$ and $g$ to determine a similar
restriction involving $f_{ii}$. From (4.10)

$$f_{is}\, g_r = g_s f_{ir}, \qquad r, s \ne i.$$





Multiplying by $p_s$ and summing over $s$, $s \ne i$, yields

$$g_r \sum_{s \ne i} p_s f_{is} = f_{ir} \sum_{s \ne i} p_s g_s. \tag{4.11}$$

But since $g$ is homogeneous of degree one,

$$\sum_{s \ne i} p_s g_s = g - p_i g_i.$$

Since $f_i$ is homogeneous of degree zero,

$$\sum_{s \ne i} p_s f_{is} = -p_i f_{ii}.$$

Hence, (4.11) implies

$$\frac{f_{ii}}{f_{ir}} = -\frac{g - p_i g_i}{p_i g_r}. \tag{4.12}$$

It is easy to verify that (4.10) and (4.12) imply

$$f_i = \Psi^i\Big(\frac{g}{p_i}\Big), \tag{4.13}$$

where the $\Psi$'s are arbitrary functions. We must now establish restrictions
on the $\Psi$'s. Since $f_{is} = f_{si}$, (4.13) implies

$$\Psi^{i\prime}\Big(\frac{g}{p_i}\Big) \frac{g_s}{p_i} = \Psi^{s\prime}\Big(\frac{g}{p_s}\Big) \frac{g_i}{p_s}.$$

Hence,

$$\frac{\Psi^{i\prime}(g/p_i)}{p_i g_i} = R(P), \qquad i = 1, \ldots, n, \tag{4.14}$$

where $R$ is a function of $P$, independent of the index $i$. The left-hand side
of (4.14) is homogeneous of degree $-1$, so $R$ is a function homogeneous
of degree $-1$. Hence, if $R$ is a constant, it must be 0; we begin by considering
that case.

If $R(P) = 0$, then (4.14) implies

$$\Psi^{i\prime}\Big(\frac{g}{p_i}\Big) = 0,$$

so $\Psi^i(g/p_i) = b_i$. Hence, $f_i(P) = b_i$, and, since $f = \sum p_k f_k$,

$$f(P) = \sum p_k b_k,$$

which is consistent with our theorem.





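If $R \ne 0$, we differentiate (4.14) with respect to $p_t$, $t \ne i$, and, forming
ratios, we find

$$\frac{\Psi^{i\prime\prime}(g/p_i)}{\Psi^{i\prime}(g/p_i)}\,\frac{g}{p_i} = \frac{g\,(R_t g_i + R g_{it})}{R g_i g_t} = \frac{R_t g}{R g_t} + \frac{g g_{it}}{g_i g_t}. \tag{4.15}$$

But, since $g$ is given by (4.5), (4.6a), or (4.6b),

$$\frac{g g_{it}}{g_i g_t} = 1 - c$$

and is independent of $P$ and the indexes $i$ and $t$. Since the right-hand side
of (4.15) is independent of the choice of $t$, we must have

$$\frac{R_t g}{R g_t} = \frac{R_s g}{R g_s} \tag{4.16}$$

for all $s \ne i$.

Since $R$ is not constant and is homogeneous of degree $-1$, we may
pick an $s \ne i$ for which $R_s \ne 0$ and write (4.16) as

$$\frac{R_t}{R_s} = \frac{g_t}{g_s},$$

which implies

$$R = D g^{-1},$$

where $D$ is a constant. Hence,

$$\frac{R_t g}{R g_t} = -1,$$

and, writing $z_i = g/p_i$, (4.15) becomes

$$\frac{\Psi^{i\prime\prime}(z_i)}{\Psi^{i\prime}(z_i)} = \frac{-c}{z_i}. \tag{4.17}$$

Hence,

$$\Psi^{i\prime}(z_i) = \delta_i z_i^{-c}. \tag{4.18}$$

We now treat separately the cases of (i) $c \ne 1$ and (ii) $c = 1$.
(i) If $c \ne 1$, (4.18) implies

$$\Psi^i(z_i) = b_i + d_i z_i^{1-c},$$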





where $d_i = \delta_i/(1 - c)$. This implies

$$f_i = b_i + d_i \Big(\frac{g}{p_i}\Big)^{1-c} \tag{4.19}$$

and

$$f = \sum p_k b_k + g^{1-c} \sum d_k p_k^c. \tag{4.20}$$

If $c = 0$, (4.20) becomes

$$f = \sum p_k b_k + Dg,$$

where $D = \sum d_k$, and is consistent with our theorem.

If $c \ne 0$, differentiating (4.20) with respect to $p_i$ yields

$$f_i = b_i + (1 - c)\, g^{-c} g_i \sum d_k p_k^c + g^{1-c} d_i c\, p_i^{c-1}.$$

Replacing $f_i$ by (4.19) and subtracting $b_i$ yields

$$d_i g^{1-c} p_i^{c-1} = (1 - c)\, g^{-c} g_i \sum d_k p_k^c + d_i c\, g^{1-c} p_i^{c-1},$$

so that

$$d_i p_i^{c-1} g = g_i \sum d_k p_k^c. \tag{4.21}$$

There are two subcases to consider: (ia) If $\sum d_k p_k^c = 0$ for all $P$, then
$d_i = 0$ for all $i$ and (4.20) implies $f = \sum p_k b_k$, as asserted in our theorem.
(ib) If $\sum d_k p_k^c \ne 0$ for all $P$, then from (4.21), $d_i \ne 0$ for any $i$. Hence,
(4.21) implies

$$\frac{g_i}{g_j} = \frac{d_i p_i^{c-1}}{d_j p_j^{c-1}}.$$

But from (4.6)

$$\frac{g_i}{g_j} = \frac{a_i p_i^{c-1}}{a_j p_j^{c-1}},$$

so

$$d_i = D a_i, \qquad i = 1, \ldots, n.$$

Hence,

$$\sum d_k p_k^c = D \sum a_k p_k^c = D g^c,$$

and (4.20) becomes

$$f = \sum p_k b_k + Dg,$$

as asserted by our theorem.





(ii) If $c = 1$, (4.18) implies

$$\Psi^i(z_i) = b_i + d_i \log z_i,$$

where $d_i = \delta_i$. This implies

$$f_i = b_i + d_i \log g - d_i \log p_i \tag{4.22}$$

and

$$f = \sum p_k b_k + \Big(\sum d_k p_k\Big) \log g - \sum d_k p_k \log p_k. \tag{4.23}$$

Differentiating (4.23) with respect to $p_i$ yields

$$f_i = b_i + d_i \log g + \frac{g_i}{g} \sum d_k p_k - d_i - d_i \log p_i. \tag{4.24}$$

Equating (4.22) and (4.24) and subtracting $b_i + d_i \log g - d_i \log p_i$, we
find

$$\frac{g_i}{g} \sum p_k d_k = d_i. \tag{4.25}$$

There are two subcases to consider: (iia) If $\sum d_k p_k = 0$ for all $P$, then
$d_i = 0$ for all $i$ and (4.23) implies $f = \sum b_k p_k$. (iib) If $\sum d_k p_k \ne 0$ for all $P$,
then, from (4.25), $d_i \ne 0$ for any $i$. Hence, (4.25) implies

$$\frac{g_i}{g_j} = \frac{d_i}{d_j},$$

which implies that the $d$'s are proportional to the $a$'s:

$$d_i = D a_i, \qquad i = 1, \ldots, n.$$

Hence,

$$\sum d_k p_k = D \sum a_k p_k,$$

and (4.23) becomes

$$f = \sum b_k p_k + D \Big(\sum a_k p_k\Big) \log\Big(\sum a_k p_k\Big) - D \sum a_k p_k \log p_k. \tag{4.26}$$

We must now show that the class of indirect utility functions defined
by (4.6b),

$$g(P) = \sum a_k p_k,$$

and (4.26) is equivalent to (2.18), although the $a$'s are defined differently





in (4.6b) and (4.26) than in (2.18). To show the equivalence of the two
classes, it is useful to rewrite (4.6b) and (4.26) replacing $a_i$ by $\bar{a}_i$; we do
this to make it clear that we are asserting the equivalence of the classes of
utility functions and not the identity of particular coefficients. Rewriting
(4.6b) and (4.26) yields

$$g = \sum \bar{a}_k p_k, \tag{4.6b$'$}$$

$$f = \sum b_k p_k + D \Big(\sum \bar{a}_k p_k\Big) \log\Big(\sum \bar{a}_k p_k\Big) - D \sum \bar{a}_k p_k \log p_k. \tag{4.26$'$}$$

We next observe that we can multiply $g$ by a positive constant without
altering the indifference map. We define $a_i$ by

$$a_i = D \bar{a}_i$$

and rewrite (4.6b$'$) and (4.26$'$) as

$$g = \sum a_k p_k, \tag{4.6b$''$}$$

$$f = \sum b_k p_k + \Big(\sum a_k p_k\Big) \log\Big(\sum a_k p_k\Big) - \sum a_k p_k \log p_k - \Big(\sum a_k p_k\Big) \log D. \tag{4.26$''$}$$

But the Gorman form does not uniquely identify $f$, and there is no way to
distinguish between $f = \bar{f}$ and $f = \bar{f} + kg$, where $k$ is any constant. Since
the last term of (4.26$''$) is $-g \log D$, (4.6b$''$) and (4.26$''$) represent the same
preference ordering as (2.18).


5. Welfare Implications 

Von Weizsäcker argues that the long-run utility function is the appropriate
criterion by which to judge the welfare effects of changes in consumption.
Since we have shown, contrary to his conjecture, that when
there are more than two goods the long-run utility function exists only in 
special cases, it cannot serve as a general welfare criterion. Revealed 
preference arguments provide no help; the counterpart of the nonexistence 
of the long-run utility function is violation of the long-run version of the 
strong axiom of revealed preference. At most, then, the long-run utility 
function may be the appropriate welfare criterion only for a narrow class 
of cases in which it exists, and I shall argue that there are compelling 
objections to it even in these cases. 





Von Weizsäcker [9, p. 352] shows that the long-run indifference curves
are related to binary choice behavior:

In other words, it is possible for the consumer to go from $Q^0$ to $Q$ in a finite
number of periods and always feel improved compared to the already attained
status quo of the last period if and only if $Q$ lies above the indifference curve
going through $Q^0$.

Let $\pi(Q^*)$ denote the short-run preference relation given that consumption
in the previous period was $Q^*$; then $Q^a\,\pi(Q^*)\,Q^b$ means that the
consumption vector $Q^a$ is preferred to $Q^b$ on the basis of the short-run
preference ordering corresponding to past consumption of $Q^*$.
Von Weizsäcker's assertion, then, is that there exists a finite sequence of
vectors $\{Q^0, Q^1, \ldots, Q^n = Q\}$ such that

$$Q^1\,\pi(Q^0)\,Q^0, \quad Q^2\,\pi(Q^1)\,Q^1, \quad \ldots, \quad Q = Q^n\,\pi(Q^{n-1})\,Q^{n-1} \tag{5.1}$$

if and only if $Q$ lies on a higher long-run indifference curve than $Q^0$. He
does not claim that (5.1) holds for every sequence of vectors
$\{Q^0, Q^1, \ldots, Q^n = Q\}$; clearly, it does not. Nor does he claim that it is true
for the doubleton sequence $\{Q^0, Q\}$. He claims only that there exists at
least one sequence satisfying (5.1) if and only if $Q$ lies on a higher indifference
curve than $Q^0$.

Von Weizsacker continues: 

In this sense, the long-run indifference curves exhibit the “long-run preference 
structure” of the person. 

Presumably this is intended as a definition of the “long-run preference 
structure.” It is not a convincing argument that the long-run utility 
function coincides with our intuitive notion of an individual’s long-run 
preference ordering. To see why, we must examine the implications of 
habit formation for binary choice behavior and the relation between 
binary choice and the long-run utility function. 

First consider the short run. Let $\pi(Q^a)$ and $\pi(Q^b)$ denote the short-run
preference relations implied by consumption of $Q^a$ and $Q^b$ in the previous
period. It is certainly possible to have $Q^a\,\pi(Q^a)\,Q^b$ and $Q^b\,\pi(Q^b)\,Q^a$. That is,
$Q^a$ is preferred to $Q^b$ on the basis of the preference ordering induced by
past consumption of $Q^a$, while $Q^b$ is preferred to $Q^a$ on the basis of the
preference ordering induced by past consumption of $Q^b$. For example,
suppose the short-run utility function is the linear expenditure system

$$U(q_{1t}, q_{2t}) = \tfrac{1}{2} \log(q_{1t} - b_{1t}) + \tfrac{1}{2} \log(q_{2t} - b_{2t}),$$

where $b_{it} = \tfrac{1}{2} q_{it-1}$. It is easy to verify that, if $Q^a = (6, 4)$ and $Q^b = (4, 7)$,
then $Q^a\,\pi(Q^a)\,Q^b$ and $Q^b\,\pi(Q^b)\,Q^a$. This is not a contradiction; it is the
essence of the habit hypothesis. Preferences depend on past consumption,
and different consumption histories imply different short-run preference
orderings.

In this case the long-run demand functions can be rationalized by the 
long-run utility function 

$$U(q_1, q_2) = \tfrac{1}{2} \log q_1 + \tfrac{1}{2} \log q_2 \tag{5.3a}$$

or, equivalently,

$$U^*(q_1, q_2) = q_1 q_2. \tag{5.3b}$$

Hence, $Q^b = (4, 7)$ is on a higher long-run indifference curve than
$Q^a = (6, 4)$. We have already seen that in a binary choice situation the
individual’s preference between (6,4) and (4, 7) depends on his past 
consumption. This does not contradict von Weizsacker’s assertion that 
there exists a sequence of consumption vectors by which the individual 
can go from Q a to Q b in a finite number of periods and always feel 
improved (on the basis of his current short-run tastes) compared to the 
consumption vector of the previous period. It does imply that 
von Weizsacker’s assertion is true only for carefully chosen sequences, 
and that it need not be true for the sequence consisting of only Q a and Q b . 
However, congruence with binary choice is the sine qua non of a utility 
function representing preferences, so the dependence of the individual’s 
preference between (6, 4) and (4, 7) on his consumption history casts 
serious doubt on the possibility of interpreting his choice behavior in 
terms of a consistent preference ordering. 
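The preference reversal and the long-run ranking can be checked by direct evaluation. The sketch below assumes the short-run utility function of the example, with $b_{it} = \tfrac{1}{2} q_{it-1}$, and the logarithmic long-run utility function (5.3a); the function names are hypothetical.

```python
import math


def short_run_utility(q, q_prev):
    """U = 1/2 log(q1 - b1) + 1/2 log(q2 - b2), with habit levels b_i = q_prev_i / 2."""
    b = [0.5 * x for x in q_prev]
    return sum(0.5 * math.log(qi - bi) for qi, bi in zip(q, b))


def long_run_utility(q):
    """Long-run utility (5.3a): 1/2 log q1 + 1/2 log q2."""
    return sum(0.5 * math.log(qi) for qi in q)


Qa = (6.0, 4.0)
Qb = (4.0, 7.0)

# Short-run ranking conditional on each consumption history.
prefers_a_after_a = short_run_utility(Qa, Qa) > short_run_utility(Qb, Qa)
prefers_b_after_b = short_run_utility(Qb, Qb) > short_run_utility(Qa, Qb)

# Long-run ranking, which ignores the history.
b_higher_long_run = long_run_utility(Qb) > long_run_utility(Qa)
```

All three comparisons hold: each basket is preferred under the tastes it induces, while the long-run function ranks $(4, 7)$ above $(6, 4)$ regardless of history.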

It might be objected that focusing on a single binary choice between 
(6, 4) and (4, 7) fails to capture the long-run aspect of the situation. But if 
the individual is offered a finite or an infinite sequence of binary choices 
between these two collections, then his choice will still depend on his 
initial consumption pattern. Suppose he is offered an infinite sequence of 
binary choices in which, at each decision point, he must choose between 
(6, 4) and (4, 7), and that his initial consumption pattern was (6, 4); then 
at each decision point he will choose (6, 4). Similarly, if his initial consump¬ 
tion pattern was (4, 7), then at each decision point he will choose (4, 7). 
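The lock-in described here can be simulated. This is a minimal sketch under the same assumptions as the example (habit levels equal to half of last period's consumption, a myopic chooser who maximizes current short-run utility); `repeated_choice` is a hypothetical helper, not something from the paper.

```python
import math


def short_run_utility(q, q_prev):
    """U = 1/2 log(q1 - b1) + 1/2 log(q2 - b2), with habit levels b_i = q_prev_i / 2."""
    b = [0.5 * x for x in q_prev]
    return sum(0.5 * math.log(qi - bi) for qi, bi in zip(q, b))


def repeated_choice(initial, options, periods=10):
    """At each decision point pick the option with the highest current short-run utility."""
    history = [initial]
    for _ in range(periods):
        previous = history[-1]
        history.append(max(options, key=lambda q: short_run_utility(q, previous)))
    return history


options = [(6.0, 4.0), (4.0, 7.0)]
path_from_a = repeated_choice((6.0, 4.0), options)
path_from_b = repeated_choice((4.0, 7.0), options)
```

Each path stays at its starting basket in every period, so the outcome of the repeated binary choice is determined entirely by the initial consumption pattern.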

The issue, then, is whether the existence of a von Weizsäcker sequence
which enables us to go from $Q^0$ to $Q$ in a finite number of steps, feeling
improved at each step, implies that the individual is better off, in terms
of his preferences, at $Q$ than at $Q^0$. I interpret the individual's willingness
to move from $Q^0$ to $Q$ in a sequence of small steps when he is unwilling to
do so in a single large step as indicative of his failure to understand the
habit formation mechanism and not of the underlying superiority of $Q$.





For example, a nonsmoker might prefer to remain a nonsmoker rather 
than smoke three packs of cigarettes a day, but he might choose to smoke 
half a pack a day rather than abstain completely. After becoming 
accustomed to smoking half a pack a day, the individual might prefer to 
remain a light smoker rather than smoke three packs a day, but he might 
choose to smoke a pack a day rather than continue at half a pack a day. 
By this process, the myopic nonsmoker is led to become a heavy smoker. 
This scenario is entirely consistent with von Weizsäcker's assumptions,
yet I am loath to conclude that the individual is better off at $Q$ than at $Q^0$.
Note that the issue is not whether cigarette smoking is harmful to his 
health; according to our individualistic welfare premise, if a man prefers 
three packs a day to none, we would say that he is better off with three. 
The example, with appropriate repackaging, applies equally to cigarettes, 
candy, or artichokes. 

To recapitulate, the individual’s long-run demand behavior is consistent 
with the hypothesis that he is maximizing the long-run utility function (5.3), 
but we cannot conclude that a movement from a lower indifference curve 
to a higher one—e.g., from (6, 4) to (4, 7)— would make him better off. 
If the individual is offered a nonmarket choice between these two baskets— 
for example, a political choice in which he must vote for one or the other 
of these two consumption patterns—then his choice will depend on his past 
consumption history. If his previous consumption has been (6, 4), then
he will prefer (6, 4); if it has been (4, 7), he will prefer (4, 7). The fact that 
the long-run utility function assigns a higher value to one than the other 
does not imply that it will be selected in this binary choice situation, and it 
does not imply that it is a point of higher welfare. 

From the standpoint of positive economics, the fact that the long-run 
demand functions were not generated by maximizing the long-run utility 
function (5.3) is completely irrelevant; the utility function is just a
convenient device for coding all of the information about demand behavior.
From the standpoint of normative economics, an individual’s utility 
function is significant because it represents his preferences. If we assume 
that social welfare depends on individual welfare and that each individual 
is the sole judge of his own welfare, then individual utility functions are 
germane to welfare economics. Ordinarily, there is no need to distinguish 
between these two interpretations of the utility function. If a utility 
function rationalizes an individual’s demand functions, we usually assume 
that it also represents his preferences. In the habit formation model, this 
is an unwarranted assumption; there is no long-run preference ordering, 
and the distinction between the demand and preference interpretations of 
the utility function is critical. 

The long-run utility function is the same type of construct as a
community indifference map which rationalizes market demand functions; if
it exists, it is a convenient device for coding all of the information about 
demand behavior, but this is all (see [8]). In general, market demand 
functions cannot be rationalized by a “market utility function.” In those 
special cases in which they can be, the utility function must be scrupulously 
interpreted in terms of positive economics; it has no normative or welfare 
significance. 

Since von Weizsäcker's long-run utility function approach to the
evaluation of welfare is technically possible only in a narrow class of cases
and is conceptually unsatisfactory even for those, it is desirable to consider
alternative approaches. One such alternative, one which von Weizsäcker
suggests in his concluding paragraph, is to view the problem in an
intertemporal framework. That is, instead of focusing on the one-period
utility function, $U(Q_t, Q_{t-1})$, assumed to be the same in every period, we
evaluate welfare in terms of the intertemporal utility function

$$V(Q) = W[U(Q_1, Q_0), U(Q_2, Q_1), \ldots, U(Q_T, Q_{T-1})]. \tag{5.4}$$

The proposed welfare test, in other words, is whether the individual,
taking full account of the impact of his present consumption on his future
tastes, would be willing to undertake a particular change.

The principal difficulty with this approach is that it is schizophrenic. The 
habit formation model is tractable because the individual is assumed to 
be myopic—he fails to recognize the impact of his current consumption on 
his future tastes. If he could be persuaded to recognize the effects of habit 
formation, then it would certainly be appropriate to base welfare
comparisons on the intertemporal utility function, but then his demand
behavior would be far more complex than that predicted by the model of 
“myopic” habit formation. If an individual insists on being myopic, it is 
less clear that the intertemporal utility function is the appropriate welfare 
criterion, but it is tempting to take a paternalistic view and argue that it is. 
However, it is difficult to reconcile this approach to welfare with an 
approach to demand analysis based on myopic habit formation. 

A second alternative to von Weizsacker’s approach retains the one- 
period framework, but, instead of beginning with the short-run utility 
function and an assumption about the way it depends on past consumption, 
it begins with a long-run (single-period) utility function and a lagged 
adjustment hypothesis. The latter is invoked to describe the movement 
from one long-run equilibrium to another. Only in a narrow class of cases 
will there exist a utility function which rationalizes the short-run demand 
functions, but the welfare analysis can be conducted on the basis of the 
long-run utility function. I find the assumption that an individual’s 





short-run behavior is consistent with a preference ordering more plausible 
than the corresponding assumption about long-run behavior, but this is 
clearly an empirical matter. The direct specification of a long-run utility 
function and an adjustment hypothesis seems to correspond more closely 
to von Weizsacker’s verbal exposition in his Section 5 than do the short-run 
utility-function-habit-formation models which provide the basis for his 
mathematical analysis and mine. 


6. Summary 

The two principal contentions of this paper are: first, in a model of 
habit formation with more than two goods, only in a narrow class of cases 
does there exist a utility function which rationalizes the long-run demand 
functions. This contention is supported by a critical counterexample 
based on an exhaustive analysis of the case of linear habit formation and 
linear Engel curves. Second, even when the long-run utility function 
exists, it is not an appropriate welfare criterion. The long-run utility 
function, when it exists, does not reflect “long-run preferences,” but is 
merely an indicator of long-run behavior. 


References 

1. W. M. Gorman, On a class of preference fields, Metroeconomica 13 (1961), 53-56.

2. W. M. Gorman, Tastes, habits and choices, Int. Econ. Rev. 8 (1967), 218-222. 

3. H. S. Houthakker, The present state of consumption theory, Econometrica 29 
(1961), 707-740. 

4. I. Lakatos, Proofs and refutations, Brit. J. Phil. Sci. 14 (1963-64), 1-25, 120-39, 
221-43, 296-342. 

5. M. McCarthy, On the stability of dynamic demand functions, Int. Econ. Rev. 
15 (February 1974), 256-259. 

6. R. A. Pollak, Habit formation and dynamic demand functions, J. Polit. Econ. 
78 (1970), 745-763. 

7. R. A. Pollak, Additive utility functions and linear Engel curves, Rev. Econ. Stud. 
38 (1971), 401-414. 

8. P. A. Samuelson, Social indifference curves, Quart. J. Econ. 70 (1956), 1-22. 

9. C. C. von Weizsäcker, Notes on endogenous change of tastes, J. Econ. Theory
3 (1971), 345-372.



JOURNAL OF ECONOMIC THEORY 13, 298-318 (1976) 


Adaptive Behavior, Demand and Preferences* 

Ahmad E. El-Safty 


Cairo University, Cairo, Egypt and Eastern Michigan University, 
Ypsilanti, Michigan 48179 

Received March 26, 1975; revised November 11, 1975 


This paper extends the work on endogenous change of tastes of von Weizsäcker
to the n-commodity framework and for a general adaptive behavior process.
The paper examines the relation between the effect of taste changes and that of
income and price changes. It provides sufficient conditions for stability of the
underlying dynamic process, establishes uniqueness of the equilibrium demand
vector and some useful relations between the long-run demand functions
and the equilibrium short-run demand functions. It is also shown that the
long-run demand functions can be rationalized by a utility function if and
only if the short-run utility function is such that any good that experiences
learning or taste change is separable from all other goods.


1. Introduction 

This paper extends the work of von Weizsäcker [6] on endogenous
change of tastes to the n-commodity framework and for a general adaptive
behavior process. In particular, the paper derives the short-run demand
functions, and examines the effect of learning or taste changes in relation
to the effect of price and income changes. It then deals with the question
of stability of the underlying dynamic process and uniqueness of the
equilibrium demand vector, and establishes some useful relations between
the long-run demand functions and the equilibrium short-run demand
functions. In addition, necessary and sufficient conditions for the long-run
demand functions to be rationalized by a utility function have been found.


* I am indebted to the associate editors, R. Eckaus, R. Solow, and F. M. Fisher for
their advice and helpful comments on a number of points. I am also grateful to the
Department of Economics of the Eastern Michigan University. This research was
partly supported by a HEW Research Grant entitled The Methods of Demographic
Projection Analysis Project conducted at the Computer Center, The American University
in Cairo.

Copyright © 1976 by Academic Press, Inc.

All rights of reproduction in any form reserved.





2. The Short-Run Demand Functions 

Weizsäcker's dynamic model, and that of Houthakker and Taylor [3]
and Pollak [5], express the generally accepted idea that current decisions
are influenced by past behavior. The effect of past behavior is assumed to
be represented entirely by the values of past period consumption.
Weizsäcker's short-run demand functions take the form:

$$X_i^t = X_i(p^t, y^t; X_1^{t-1}, X_2^{t-1}), \qquad i = 1, 2, \tag{2.1}$$

where $p$'s, $X$'s, and $y$'s denote prices, quantities, and income.

In El-Safty [1], I found it desirable and perhaps more illuminating to
reformulate the dynamic demand model in terms of the widely accepted
notion of "learning by doing." Let the consumer's ordinal utility function
be of the form

$$U(\,\cdot\,) = U(\psi^1, \psi^2, \ldots, \psi^n), \tag{2.2}$$

where $(\psi^1, \ldots, \psi^n)$ are the current services provided by the purchases
of the $n$ commodities, $(X_1, X_2, \ldots, X_n)$. The service function $\psi^i$ is assumed
to depend on current purchases of good $i$ and on the learning (technology)
parameter of good $i$, $\Theta_i$. That is:

$$\psi^i = \psi^i(X_i, \Theta_i), \qquad i = 1, 2, \ldots, n. \tag{2.3}$$

The function $\psi^i$ is assumed to be a strictly increasing function of $X_i$.
The technology parameter, or reaction potential, of good $i$, $\Theta_i$, is assumed
to be directly correlated with a number of variables: the habit strength,
psychological, or physical stock of good $i$, $H_i$; drive, $D_i$; incentive
motivation, $K_i$; and stimulus-intensity dynamism, $V_i$. (See [1, Chap. II]
for a review of learning theory.) In functional form, $\Theta_i$ is given by:

$$\Theta_i = \Theta_i(H_i, D_i, V_i, K_i). \tag{2.4}$$

Drives are intervening variables tied to the magnitude of the psychological
stock variable, $H_i$. The greater the psychological stock (i.e., the more the
consumer fulfils his needs for commodity $i$), the smaller the strength of $D_i$.
$V_i$ and $K_i$ are like $D_i$ in that they are performance variables which have
their effect directly on the reaction potential. Changes in the quality of
the good will change $V$. Imitations, social interactions, and changes in the
social norms will show their effect on $\Theta$ through $K$. Awareness about the
content of the good, say, magnitude of mercury, calorie content, percentage
of fat, etc., made available by advertising or some other means, affects $\Theta$
through $K$ and/or $V$. (For complete discussions about the underlying
process, see [1, Chap. III].)





Parametrization of learning (or taste) change as in (2.2) has been
employed by Fisher and Shell [2] and El-Safty [1]. Fisher and Shell have
treated the case $\psi^i = g_i(\Theta_i) X_i$, that is, when learning has a purely
augmentation effect. Also, Houthakker and Taylor’s dynamic demand
model has been derived from (2.2) in the special case when $U(\cdot)$ is quadratic
and $\psi^i = X_i - \alpha_i H_i$.

In this paper, we shall assume that V and K are exogenous while H
changes according to:

$$H_i^t = (1 - h_i) H_i^{t-1} + X_i^{t-1}, \quad 0 < h_i < 1. \tag{2.5}$$

And in the continuous case, (2.5) takes the form:

$$\dot{H}_i = -h_i H_i + X_i, \quad 0 < h_i < 1. \tag{2.6}$$
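
The stock-adjustment rule (2.5) can be illustrated with a minimal sketch. The numbers below (consumption held at 2.0, depreciation rate 0.25, zero initial stock) are hypothetical and chosen only for illustration; the function name `habit_path` is likewise an assumption, not part of the paper.

```python
def habit_path(x, h, h0=0.0, periods=200):
    """Iterate H^t = (1 - h) * H^{t-1} + X^{t-1} with consumption held fixed at x."""
    path = [h0]
    for _ in range(periods):
        path.append((1.0 - h) * path[-1] + x)
    return path


# Hypothetical values: constant consumption x and depreciation rate h.
x, h = 2.0, 0.25
path = habit_path(x, h)
steady_state = x / h  # fixed point of (2.5): H* = X / h
print(path[1], path[-1], steady_state)
```

With consumption held constant, the stock approaches the fixed point $H^* = X/h$, which is the equilibrium relation used later in Section 3.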

In each period, the vector $\Theta$ is given from (2.4) and (2.5) and the
consumer is assumed to purchase a combination of the n commodities
from which he derives the highest level of satisfaction. The consumer’s
problem is, then, one of short-run maximization rather than long run.
Although changes in $\Theta$ are perceived by the consumer, we assume that
he does not anticipate these changes. Or, if he does anticipate them,
we assume that the consumer is not aware of the underlying mechanisms
by which these changes take place. In reality, however, our assumption
may be violated due to education and experience, which broaden people’s
horizons and enable them to understand causality. This point is brought
out in the following anecdote (related to me by F. M. Fisher): Johnny
was asked why he didn’t eat spinach. He replied, “I don’t like spinach.”
To this an adult responded, “Try it, you’ll like it.” “I know,” Johnny
replied, “but I don’t want to like it.” The question of long-run maximization
will not be considered here (see, for example, El-Safty [1, Chap. 5,
Sect. 3]).

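The short-run demand functions:

$$X_i^t = X_i(p_1, p_2, \ldots, p_n; y, \Theta_1(H_1^t), \ldots, \Theta_n(H_n^t)), \quad i = 1, 2, \ldots, n, \tag{2.7}$$

satisfy the first-order conditions:

$$\psi^i_1 U_i - \lambda p_i = 0, \quad i = 1, 2, \ldots, n, \tag{2.8}$$

$$\sum_i p_i X_i - y = 0, \tag{2.9}$$

where $\lambda$ is a Lagrange multiplier and is the marginal utility of income
(here $\psi^i_1$ denotes $\partial\psi^i/\partial X_i$ and $U_i$ denotes $\partial U/\partial\psi^i$).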

In El-Safty [1], I prove the following: 





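Theorem 2.1.

$$\text{(A)} \quad \partial X_j/\partial\Theta_j = S_j + p_j S_j(\sigma_j - 1)(\partial X_j/\partial y) + p_j(\partial S_j/\partial X_j) \cdot (\partial X_j/\partial p_j), \tag{2.10}$$

$$\text{(B)} \quad \partial X_i/\partial\Theta_j = p_j S_j(\sigma_j - 1)(\partial X_i/\partial y) + p_j(\partial S_j/\partial X_j) \cdot (\partial X_i/\partial p_j), \quad i \neq j, \tag{2.11}$$

where

$$S_j = -(\psi^j_2/\psi^j_1) \tag{2.12}$$

is the slope of the indifference curve $\psi^j(X_j, \Theta_j)$ = constant, and

$$\sigma_j = (X_j/S_j) \cdot (\partial S_j/\partial X_j) \tag{2.13}$$

is the elasticity of substitution.

Note. $\sigma_j = 0$ if and only if $\psi^j = X_j - f_j(\Theta_j)$, and $\sigma_j = 1$ if and only
if $\psi^j = g_j(\Theta_j) X_j$; see [1, Lemma 4.2.3].

The case $\sigma_j = 0$ is the pure habit forming case and we may measure $\Theta_j$
in units of $f_j$. The case $\sigma_j = 1$ is the pure learning case, i.e., when
learning has a purely augmentation effect on the services provided by $X_j$
units of commodity j. In such a case, we may measure $\Theta_j$ in units of $g_j$.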

Corollary 2.1 (Fisher and Shell). If commodity j is subject to pure
learning changes, then

$$\text{(A)} \quad \xi_{jj} + \eta_{jj} = -1, \tag{2.14}$$

$$\text{(B)} \quad \xi_{ij} + \eta_{ij} = 0, \tag{2.15}$$

where

$$\eta_{ij} = (p_j/X_i)(\partial X_i/\partial p_j) \tag{2.16}$$

is the gross “price” elasticity of demand for the ith good with respect to
the jth price, and

$$\xi_{ij} = (\Theta_j/X_i)(\partial X_i/\partial\Theta_j) \tag{2.17}$$

is the gross “learning” elasticity of demand for the ith good with respect to
the jth learning parameter.

Proof. The corollary is an immediate consequence of Theorem 2.1
and the definition of $\eta_{ij}$ and $\xi_{ij}$, and by noticing that in this case $\psi^j = \Theta_j X_j$,
$\sigma_j = 1$, $S_j = -X_j/\Theta_j$, and $\partial S_j/\partial X_j = -1/\Theta_j$.

From the Corollary, we see that the ith demand equation is homogeneous
of degree zero in $p_j$ and $\Theta_j$ ($i \neq j$) while the jth demand equation is
homogeneous of degree $-1$ in $p_j$ and $\Theta_j$. Knowledge of learning elasticity
requires only knowledge of price elasticity and vice versa. If the demand
equation for a particular commodity is log-additive in $p_j$, then it is also
log-additive in $\Theta_j$. In the special case $h_j = 1$ (i.e., influences from past
consumption other than the last period are ignored), $\Theta_j^t$ depends
only on $X_j^{t-1}$. Thus a log-additive demand equation in both prices and
lagged consumption of those commodities which have nonzero price
coefficients is justified.
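
The elasticity relations of Corollary 2.1 can be checked numerically in a small sketch. It assumes, purely for illustration, a two-good CES utility over services with $\psi^1 = \Theta X_1$ (pure learning) and $\psi^2 = X_2$; the functional form, the parameter values, and the names `ces_demand` and `log_elasticity` are assumptions, not taken from the paper.

```python
import math


def ces_demand(p1, p2, y, theta, sigma=0.5):
    """Demands when utility is CES in (psi1, psi2), psi1 = theta * X1, psi2 = X2."""
    q1 = p1 / theta  # effective price of one unit of service from good 1
    denom = q1 ** (1 - sigma) + p2 ** (1 - sigma)
    psi1 = y * q1 ** (-sigma) / denom  # demand for services of good 1
    x2 = y * p2 ** (-sigma) / denom
    return psi1 / theta, x2


def log_elasticity(f, x, eps=1e-5):
    """Central-difference estimate of d log f / d log x."""
    up, down = f(x * math.exp(eps)), f(x * math.exp(-eps))
    return (math.log(up) - math.log(down)) / (2 * eps)


# Hypothetical prices, income, and learning parameter.
p1, p2, y, theta = 2.0, 1.5, 10.0, 1.3
eta_11 = log_elasticity(lambda p: ces_demand(p, p2, y, theta)[0], p1)
xi_11 = log_elasticity(lambda t: ces_demand(p1, p2, y, t)[0], theta)
eta_21 = log_elasticity(lambda p: ces_demand(p, p2, y, theta)[1], p1)
xi_21 = log_elasticity(lambda t: ces_demand(p1, p2, y, t)[1], theta)
print(xi_11 + eta_11, xi_21 + eta_21)
```

In this example the own-good sum is close to $-1$ and the cross-good sum is close to zero, in line with (2.14) and (2.15), because demand depends on $p_1$ and $\Theta$ only through the effective price $p_1/\Theta$.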

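Corollary 2.2. If commodity j is purely habit forming, then

$$\text{(A)} \quad \partial X_j/\partial\Theta_j = 1 - p_j(\partial X_j/\partial y), \tag{2.18}$$

$$\text{(B)} \quad \partial X_i/\partial\Theta_j = -p_j(\partial X_i/\partial y). \tag{2.19}$$

Proof. The corollary is an immediate consequence of Theorem 2.1.
The proof follows immediately by noticing that in this case $\psi^j = X_j - \Theta_j$,
$S_j = 1$, and $\sigma_j = 0$.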

Another interesting way to prove the corollary is as follows. Using the
transformation:

$$X_j^* = X_j - \Theta_j,$$

$$y^* = y - p_j\Theta_j,$$

we see that for $y^*$ fixed, the equilibrium purchases $(X_1, X_2, \ldots, X_j^*, \ldots, X_n)$
are independent of the values of y and $\Theta_j$. We thus have:

$$\partial X_j^*/\partial\Theta_j = (\partial X_j^*/\partial y)\,[(\partial y^*/\partial\Theta_j)|_{y = \text{constant}}],$$

$$\partial X_i/\partial\Theta_j = (\partial X_i/\partial y)\,[(\partial y^*/\partial\Theta_j)|_{y = \text{constant}}].$$

Using the definition of $X_j^*$ and $y^*$ yields the required results of the
corollary.

From the corollary, we see that if commodity j is purely habit forming
(or durable), then the demand for good i is linear in the jth learning
parameter if and only if it is linear in income.
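
Corollary 2.2 can be illustrated with a short numerical sketch. It assumes, for illustration only, a two-good Stone-Geary utility $\beta_1 \log(X_1 - \Theta_1) + \beta_2 \log X_2$, so that good 1 is purely habit forming; the utility form, the parameter values, and the helper names are hypothetical.

```python
def demand(p1, p2, y, theta1, beta1=0.3):
    """Stone-Geary demands with psi1 = X1 - theta1 and psi2 = X2."""
    beta2 = 1.0 - beta1
    supernumerary = y - p1 * theta1  # income left after the committed quantity
    x1 = theta1 + beta1 * supernumerary / p1
    x2 = beta2 * supernumerary / p2
    return x1, x2


def derivative(f, x, eps=1e-6):
    """Central-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Hypothetical prices, income, and habit parameter.
p1, p2, y, theta1 = 2.0, 1.5, 10.0, 1.0
dx1_dtheta = derivative(lambda t: demand(p1, p2, y, t)[0], theta1)
dx2_dtheta = derivative(lambda t: demand(p1, p2, y, t)[1], theta1)
b_1y = derivative(lambda inc: demand(p1, p2, inc, theta1)[0], y)
b_2y = derivative(lambda inc: demand(p1, p2, inc, theta1)[1], y)
print(dx1_dtheta, 1 - p1 * b_1y)  # the two sides of (2.18)
print(dx2_dtheta, -p1 * b_2y)     # the two sides of (2.19)
```

Because both demands are linear in income here, they are also linear in $\Theta_1$, which is the linearity property noted after the corollary.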

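We introduce the following notations, referring to the demand functions
(2.7). Let

$$a_{ij} = \partial X_i/\partial H_j = (\partial X_i/\partial\Theta_j) \cdot (d\Theta_j/dH_j),$$

$$b_{ij} = \partial X_i/\partial p_j, \qquad b_{iy} = \partial X_i/\partial y,$$

$$S_{ij} = (\partial X_i/\partial p_j)|_{U = \text{constant}}.$$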





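Lemma 2.1.

$$-p_1 b_{iy} S_{11} = [1 - p_1 b_{1y}]\, S_{i1}, \quad i = 2, 3, \ldots, n \tag{2.20}$$

if and only if the underlying utility function is of the form

$$U(\cdot) = F(\psi^1, Z(\psi^2, \ldots, \psi^n)) \tag{2.21}$$

for some choice of continuously differentiable functions F and Z.

Proof. Necessity. If the demand functions satisfy (2.20), then by using
the Slutsky Equations:

$$S_{ij} = b_{ij} + X_j b_{iy}, \quad i, j = 1, 2, \ldots, n, \tag{2.22}$$

we get

$$b_{i1} = K b_{iy}, \quad i = 2, 3, \ldots, n, \tag{2.23}$$

where

$$K = -(p_1 b_{11} + X_1)/(1 - p_1 b_{1y}).$$

From the first-order conditions (2.8), we get

$$\psi^1_1 \psi^i_1 U_{i1} b_{11} + \psi^i_{11} U_i b_{i1} + \sum_{k=2}^{n} \psi^k_1 \psi^i_1 U_{ik} b_{k1} = p_i(\partial\lambda/\partial p_1), \quad i \neq 1, \tag{2.24}$$

$$\psi^1_1 \psi^i_1 U_{i1} b_{1y} + \psi^i_{11} U_i b_{iy} + \sum_{k=2}^{n} \psi^k_1 \psi^i_1 U_{ik} b_{ky} = p_i(\partial\lambda/\partial y), \quad i \neq 1. \tag{2.25}$$

Multiplying (2.25) by $-K$ and adding it to (2.24), then using (2.23), we get

$$\psi^1_1 \psi^i_1 U_{i1}(b_{11} - K b_{1y}) = p_i[(\partial\lambda/\partial p_1) - K(\partial\lambda/\partial y)], \quad i \neq 1. \tag{2.26}$$

$b_{11} \neq K b_{1y}$, for, if $b_{11} = K b_{1y}$, then from the definition of K, we see that
$S_{11} = 0$, which is impossible since $U(\cdot)$ is assumed to be strictly
quasi-concave.

Also, since $p_i = \psi^i_1 U_i/\lambda$, then (2.26) implies

$$U_j U_{i1} - U_i U_{j1} = 0, \quad i, j \neq 1,$$

which implies

$$(\partial/\partial\psi^1)(U_i/U_j) = 0, \quad i, j \neq 1. \tag{2.27}$$

Condition (2.27) is equivalent to (2.21) by a famous theorem of Leontief [4].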



Sufficiency. If the utility function is of the form (2.21), then the
demand functions satisfy the first-order conditions

$$F_1 \psi^1_1 = \lambda p_1, \tag{2.28}$$

$$F_2 Z_{i-1} \psi^i_1 = \lambda p_i, \quad i = 2, 3, \ldots, n, \tag{2.29}$$

$$\sum_i p_i X_i = y. \tag{2.30}$$

By eliminating $\lambda$ from (2.29) and using (2.30), we see that $(X_2, X_3, \ldots, X_n)$
may be solved in terms of $(y - p_1 X_1)$. Thus, we have:

$$X_i = f^i(y - p_1 X_1, p_2, \ldots, p_n, \Theta_2, \ldots, \Theta_n), \quad i \neq 1. \tag{2.31}$$

By differentiating (2.31) with respect to y and $p_1$, we get:

$$b_{iy}(-X_1 - p_1 b_{11}) = b_{i1}(1 - p_1 b_{1y}). \tag{2.32}$$

By using the Slutsky Equation, (2.32) reduces to

$$-p_1 b_{iy} S_{11} = [1 - p_1 b_{1y}]\, S_{i1}, \quad i = 2, \ldots, n,$$

which completes the proof of the Lemma.

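Lemma 2.2.

$$a_{ij} S_{jj} = a_{jj} S_{ij}, \quad i = 1, 2, \ldots, n, \; i \neq j, \tag{2.33}$$

for some j, if and only if

$$-p_j b_{iy} S_{jj} = [1 - p_j b_{jy}]\, S_{ij}, \quad i = 1, 2, \ldots, n, \; i \neq j. \tag{2.34}$$

Proof. Using the Slutsky Equation, Theorem 2.1 may be written in
the form

$$\text{(A)} \quad \partial X_j/\partial\Theta_j = S_j(1 - p_j(\partial X_j/\partial y)) + p_j(\partial S_j/\partial X_j) \cdot S_{jj}, \tag{2.35}$$

$$\text{(B)} \quad \partial X_i/\partial\Theta_j = -p_j S_j(\partial X_i/\partial y) + p_j(\partial S_j/\partial X_j) \cdot S_{ij}, \quad i \neq j, \tag{2.36}$$

where $S_j$ is the slope of the indifference curve $\psi^j(X_j, \Theta_j)$ = constant.

Multiplying Equation (2.35) by $S_{ij}$ and (2.36) by $S_{jj}$ yields the required
results of the Lemma.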

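Theorem 2.2.

$$-p_j b_{iy} S_{jj} = [1 - p_j b_{jy}]\, S_{ij}, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, r, \; j \neq i, \; r < n, \tag{2.37}$$

if and only if the utility function is of the form:

$$\text{(A)} \quad U(\cdot) = F(\psi^1, Z(\psi^2, \psi^3, \ldots, \psi^n)), \quad r = 1, \tag{2.38}$$

$$\text{(B)} \quad U(\cdot) = F(\eta_1(\psi^1) + \eta_2(\psi^2) + \cdots + \eta_r(\psi^r) + Z(\psi^{r+1}, \ldots, \psi^n)), \quad r > 1. \tag{2.39}$$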

Proof. Part (A) is a repetition of Lemma 2.1, so we concentrate on
part (B). From Lemma 2.1 and by an argument similar to that leading to
condition (2.27), we see that Eq. (2.37) holds if and only if:

$$(\partial/\partial\psi^j)(U_i/U_k) = 0, \quad k = 1, 2, \ldots, n, \; j = 1, 2, \ldots, r, \; i \neq j \neq k. \tag{2.40}$$

Also,

$$(\partial/\partial\psi^i)(U_k/U_j) = 0, \quad k = 1, 2, \ldots, n, \; i = 1, 2, \ldots, r, \; i \neq j \neq k. \tag{2.41}$$

From (2.40) and (2.41), we see that

$$(\partial/\partial\psi^k)(U_i/U_j) = 0, \quad k = 1, 2, \ldots, n, \; i, j = 1, 2, \ldots, r, \; i \neq j \neq k. \tag{2.42}$$

Thus, for i, j = 1, 2,..., r, $i \neq j$, $U_i/U_j$ is independent of all $\psi$’s except
$\psi^i$ and $\psi^j$. That is,

$$U_i/U_j = F^{ij}(\psi^i, \psi^j), \quad i, j = 1, 2, \ldots, r, \; i \neq j, \tag{2.43}$$

for some choice of the function $F^{ij}$.

But since $U_i/U_k$ is independent of $\psi^j$ [Eq. (2.40)] and $U_k/U_j$ is independent
of $\psi^i$ [Eq. (2.41)], $F^{ij}$ must be of the form

$$F^{ij} = \xi_i(\psi^i)/\xi_j(\psi^j), \quad i, j = 1, 2, \ldots, r; \tag{2.44}$$

thus, the function $U(\cdot)$ must be such that

$$U_i = K(\psi^1, \psi^2, \ldots, \psi^n)\, \xi_i(\psi^i), \quad i = 1, 2, \ldots, r, \tag{2.45}$$

where K is an arbitrary function of $\psi^1, \psi^2, \ldots, \psi^n$. Thus

$$U(\cdot) = F(\eta_1(\psi^1) + \eta_2(\psi^2) + \cdots + \eta_r(\psi^r), Z(\psi^{r+1}, \ldots, \psi^n)), \tag{2.46}$$

where $\eta_i'(\psi^i) = \xi_i(\psi^i)$ and $F_1 = K$.

From (2.46), we see that $U_1 = F_1 \eta_1'$ and $U_{r+1} = F_2 \cdot Z_1$. But since
$U_{r+1}/U_1$ is independent of $\psi^j$, the function F can be chosen such that
$F_1 = F_2$ everywhere, which completes the proof of the theorem.

Remark. From part (A) of the theorem, we see that in the two-commodity
case, form (2.38) does not impose any new restriction on the
utility function. Thus Eq. (2.37) will be satisfied anyway in the
two-commodity case.

The theorem is very constructive. From the theorem we see that any
result in the two-commodity case which utilizes the properties of the
Slutsky terms will remain valid in the n-commodity case if and only if the
utility function is additive (part B) or at least separable (part A).

Theorem 2.3.

$$a_{ij} S_{jj} = a_{jj} S_{ij}, \quad i = 1, 2, \ldots, n, \; i \neq j, \; j = 1, 2, \ldots, r, \tag{2.47}$$

if and only if the utility function is of the form

$$\text{(A)} \quad U(\cdot) = F(\psi^1, Z(\psi^2, \ldots, \psi^n)), \quad r = 1,$$

$$\text{(B)} \quad U(\cdot) = F(\eta_1(\psi^1) + \cdots + \eta_r(\psi^r) + Z(\psi^{r+1}, \ldots, \psi^n)), \quad r > 1.$$

Proof. The theorem follows immediately from Lemma 2.2 and
Theorem 2.2.

Corollary 2.3.

$$-p_j b_{iy}\, a_{jj} = [1 - p_j b_{jy}]\, a_{ij}, \quad i = 1, 2, \ldots, n, \; i \neq j, \tag{2.48}$$

if

(A) the utility function is separable in $\psi^j$, or,

(B) good j is purely habit forming.

Proof. Part (A) follows immediately from Theorems 2.2 and 2.3.
Part (B) is an immediate consequence of Corollary 2.2.


3. Stability of the Adaptation Process 

For a given income and prices, the demand vector $X^t$, all t, is obtained
by solving the first-order conditions (2.8) and (2.9). Thus $X^t$ is a function
of P, y, and $\Theta^t$, and the demand functions may be written in the form:

$$X_i^t = X_i(P, y; \Theta_1^t, \ldots, \Theta_n^t), \quad i = 1, 2, \ldots, n, \tag{3.1}$$

and since $\Theta_i^t$ is a function of $H_i^t$, (3.1) may be written in the form:

$$X_i^t = X_i(P, y; H_1^t, H_2^t, \ldots, H_n^t), \quad i = 1, 2, \ldots, n. \tag{3.2}$$

From (3.2), we see that if the habit vector H converges to a limit, then
the demand vector also converges.

By the mean value theorem, the system of demand equations (3.2) may
be written in the form

$$\Delta X_i^t = a_{i1}\Delta H_1^t + a_{i2}\Delta H_2^t + \cdots + a_{in}\Delta H_n^t, \quad i = 1, 2, \ldots, n, \tag{3.3}$$

where the partial derivatives $a_{ij} = (\partial X_i/\partial H_j)$ are evaluated at P, y, and
$\beta_i H^t + (1 - \beta_i) H^{t-1}$, $0 < \beta_i < 1$. In matrix notation, (3.3) takes the
form

$$\Delta X^t = A(\Delta H^t), \tag{3.4}$$

where

$$\Delta X_i^t = X_i^t - X_i^{t-1}, \qquad \Delta H_i^t = H_i^t - H_i^{t-1},$$

and a typical element of the matrix A is $a_{ij}$.

By differentiating the budget equation with respect to $H_j$, we see that
the elements of the matrix A satisfy the identity

$$\sum_i p_i a_{ij} = 0, \quad j = 1, 2, \ldots, n. \tag{3.5}$$

The habit vector H is assumed to follow the difference equation

$$H^t = (I - h) H^{t-1} + X^{t-1}, \tag{3.6}$$

where I is the identity matrix of order n, and h is a diagonal matrix whose
typical element is $h_i$, $0 < h_i < 1$.

By taking first differences of (3.6), we get

$$\Delta H^t = (I - h)\,\Delta H^{t-1} + \Delta X^{t-1}. \tag{3.7}$$

Substituting for $\Delta X^{t-1}$ from (3.4), we obtain

$$\Delta H^t = [(I - h) + A]\,\Delta H^{t-1}. \tag{3.8}$$

In the continuous case, it is easily seen that the difference equation
system (3.8) reduces to

$$\dot{H} = (-h + A)H. \tag{3.9}$$





In the continuous case, the adaptation process is stable if the characteristic
roots of $(-h + A)$ all have negative real parts. In the discrete case,
the adaptation process is stable if all the characteristic roots of $(I - h + A)$
are inside the unit circle. Stability of the adaptation process in the discrete
case implies stability in the continuous case, since the characteristic roots of
$(-h + A)$ are one less than the characteristic roots of $(I - h + A)$.
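
The relation between the two sets of characteristic roots can be seen in a minimal two-good sketch. The matrices below are hypothetical numbers chosen so that the columns of A sum to zero, as (3.5) requires with unit prices; the helper `eig2` is an illustrative closed-form solver for 2 x 2 matrices, not something taken from the paper.

```python
import cmath


def eig2(m):
    """Characteristic roots of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)


# Hypothetical adjustment rates and habit responses (columns of A sum to zero).
h = [0.5, 0.4]
A = [[0.1, -0.2],
     [-0.1, 0.2]]

continuous = [[A[0][0] - h[0], A[0][1]],
              [A[1][0], A[1][1] - h[1]]]           # -h + A
discrete = [[1 + continuous[0][0], continuous[0][1]],
            [continuous[1][0], 1 + continuous[1][1]]]  # I - h + A

roots_c = eig2(continuous)
roots_d = eig2(discrete)
print(roots_c, roots_d)
```

Each root of the discrete-time matrix is the corresponding continuous-time root plus one, so a discrete root inside the unit circle necessarily has a continuous counterpart with negative real part.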

Definition 3.1. Good i is said to be a gross “learning” substitute
(complement) for good j if $\partial X_i/\partial\Theta_j < (>)\ 0$, $i \neq j$.

Theorem 3.1. If for every j = 1, 2,..., n, all goods are either gross
learning substitutes or gross learning complements for good j and $h_j \geq 2a_{jj}$,
j = 1, 2,..., n, with the strict inequality for at least one j, then the system (3.9)
is stable.

Proof. Since all goods are either gross learning substitutes or gross
learning complements for commodity j, then all $a_{ij}$’s ($i \neq j$) have the same
sign (note that $a_{ij} = (\partial X_i/\partial\Theta_j) \cdot \Theta_j'$). Since units of measurement may be
chosen so that $p_j = 1$, j = 1, 2,..., n, then from (3.5) we have:

$$a_{jj} = -\sum_{i \neq j} a_{ij}.$$

Thus,

$$|a_{jj}| = \sum_{i \neq j} |a_{ij}|,$$

and since $h_j \geq 2a_{jj}$, then $-h_j + a_{jj}$ is negative and

$$|-h_j + a_{jj}| \geq |a_{jj}| = \sum_{i \neq j} |a_{ij}|.$$

Thus the matrix $(-h + A)$ has a negative dominant diagonal and all its
characteristic roots have negative real parts, which proves the theorem.

Theorem 3.2. If all commodities are either gross learning substitutes
or gross learning complements for good j and $h_j > 2|a_{jj}|$, then the system (3.8) is
stable.

Proof. As in Theorem 3.1, we have

$$|a_{jj}| = \sum_{i \neq j} |a_{ij}|.$$

Since $h_j > 2|a_{jj}|$, then

$$|1 - h_j + a_{jj}| + \sum_{i \neq j} |a_{ij}| \leq 1 - h_j + |2a_{jj}| < 1.$$

Thus under the conditions of the theorem, all column sums of absolute
values of the matrix $(I - h + A)$ are less than one and the matrix
$(I - h + A)$ has no characteristic root of modulus exceeding unity,
proving the theorem.
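
The column-sum argument in the proof can be traced through a small sketch. The three-good matrix below is hypothetical: its columns sum to zero (unit prices), the off-diagonal entries in each column share a sign, and each $h_j$ exceeds $2|a_{jj}|$.

```python
# Hypothetical habit responses a_ij; each column sums to zero, as in (3.5) with p_j = 1.
A = [[0.10, -0.05, 0.02],
     [-0.04, 0.08, 0.03],
     [-0.06, -0.03, -0.05]]
h = [0.5, 0.3, 0.2]  # hypothetical adjustment rates with h_j > 2 |a_jj|
n = len(h)

# Discrete-time transition matrix I - h + A.
M = [[(1.0 - h[i] if i == j else 0.0) + A[i][j] for j in range(n)] for i in range(n)]

# Column sums of absolute values, and the bound 1 - h_j + 2 |a_jj| from the proof.
column_sums = [sum(abs(M[i][j]) for i in range(n)) for j in range(n)]
bounds = [1.0 - h[j] + 2.0 * abs(A[j][j]) for j in range(n)]
print(column_sums, bounds)
```

Since the largest column sum of absolute values is a matrix norm, it bounds the modulus of every characteristic root, which is the step the proof relies on.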

Here we find that if $a_{jj}$ is negative, its absolute value may not exceed $h_j/2$
for the system to be stable. In the continuous case there was no restriction
imposed on $a_{jj}$ if it is negative. Theorem 3.1 illustrates very well the
problem associated with the stability of systems having delayed and
discontinuous response.

At equilibrium, $H_i^* = X_i^*/h_i$, i = 1, 2,..., n. Thus in the neighborhood
of equilibrium, we have, from Eq. (3.4) and Eq. (3.7),

$$\Delta X^t = (A h^{-1})\,\Delta X^{t-1}.$$

The matrix $\bar{A} = A h^{-1}$ plays a crucial role in the rest of the paper. We
shall now investigate the characteristic roots of $\bar{A}$ in the special case when
the $\bar{a}_{ij}$’s satisfy the relation:

$$\bar{a}_{ij} = -p_j b_{iy}\, \bar{a}_{jj}/(1 - p_j b_{jy}),$$

that is, when good j is purely habit forming, or when the utility function
is additive in $\psi^j$. If the utility function is additive in $\psi^j$, then our strict
quasi-concavity assumption implies that good j cannot be an inferior good.
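A typical element of the matrix $(\bar{A} - \lambda I)$ is given by:

$$\bar{a}_{ij} - \delta_{ij}\lambda = (\lambda_j - \lambda)\,\delta_{ij} - p_j b_{iy}\lambda_j,$$

where $\delta_{ij}$ is the Kronecker delta,

$$\bar{a}_{ij} = a_{ij}/h_j, \quad \text{and} \quad \lambda_j = \bar{a}_{jj}/(1 - p_j b_{jy}).$$

In this case, the jth principal minor of the matrix $(\bar{A} - \lambda I)$ is given by

$$M(j) = \begin{vmatrix}
(\lambda_1 - \lambda) - p_1 b_{1y}\lambda_1 & -p_2 b_{1y}\lambda_2 & \cdots & -p_j b_{1y}\lambda_j \\
-p_1 b_{2y}\lambda_1 & (\lambda_2 - \lambda) - p_2 b_{2y}\lambda_2 & \cdots & -p_j b_{2y}\lambda_j \\
\vdots & \vdots & & \vdots \\
-p_1 b_{jy}\lambda_1 & -p_2 b_{jy}\lambda_2 & \cdots & (\lambda_j - \lambda) - p_j b_{jy}\lambda_j
\end{vmatrix}$$

$$= (\lambda_j - \lambda) \cdot M(j - 1) - p_j\lambda_j \times \begin{vmatrix}
(\lambda_1 - \lambda) - p_1 b_{1y}\lambda_1 & -p_2 b_{1y}\lambda_2 & \cdots & b_{1y} \\
-p_1 b_{2y}\lambda_1 & (\lambda_2 - \lambda) - p_2 b_{2y}\lambda_2 & \cdots & b_{2y} \\
\vdots & \vdots & & \vdots \\
-p_1 b_{jy}\lambda_1 & -p_2 b_{jy}\lambda_2 & \cdots & b_{jy}
\end{vmatrix}.$$

By multiplying the last column by $p_k\lambda_k$ and adding it to the kth column,
we get

$$M(j) = (\lambda_j - \lambda) \cdot M(j - 1) - p_j b_{jy}\lambda_j \prod_{k=1}^{j-1} (\lambda_k - \lambda).$$

Applying this recursive formula, we get:

$$|\bar{A} - \lambda I| = \prod_{k} (\lambda_k - \lambda)\left[1 - \sum_{j} p_j b_{jy}\lambda_j/(\lambda_j - \lambda)\right] = 0,$$

which is the characteristic equation of the matrix $\bar{A}$.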
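
The closed-form determinant can be cross-checked numerically. The sketch below builds the matrix with typical element $(\lambda_j - \lambda)\delta_{ij} - p_j b_{iy}\lambda_j$ from hypothetical values of $p_j$, $b_{iy}$, and $\lambda_j$, and compares its determinant (computed by Gaussian elimination in the helper `det`, an illustrative routine) with the product formula.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, value = len(m), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-15:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            value = -value  # a row swap flips the sign
        value *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return value


# Hypothetical values for illustration only.
p = [1.0, 2.0, 0.5]
b_y = [0.3, 0.1, 0.4]       # income derivatives b_iy
lam_j = [-0.4, 0.2, 0.6]    # the lambda_j defined in the text
lam = 0.35                  # trial value of lambda
n = len(p)

M = [[(lam_j[j] - lam) * (1.0 if i == j else 0.0) - p[j] * b_y[i] * lam_j[j]
      for j in range(n)] for i in range(n)]

product = 1.0
for k in range(n):
    product *= lam_j[k] - lam
closed_form = product * (1.0 - sum(p[j] * b_y[j] * lam_j[j] / (lam_j[j] - lam)
                                   for j in range(n)))
print(det(M), closed_form)
```

The agreement reflects the fact that the matrix is a diagonal matrix minus a rank-one term, which is why the recursion collapses to a single bracketed sum.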

If $\bar{a}_{jj} = 0$, then $\lambda_j = 0$, and we see that $\lambda = 0$ is a characteristic root.
The case $\bar{a}_{jj} = 0$ is the case when the jth commodity does not experience
learning changes, since in this case $\bar{a}_{ij} = 0$, i = 1, 2,..., n. For the sake
of generality, we shall assume that only the first r commodities experience
learning changes, that is, $\bar{a}_{jj} \neq 0$, j = 1, 2,..., r, and r < n. Consider the
function

$$g(\lambda) = 1 - \sum_{j=1}^{r} p_j b_{jy}\lambda_j/(\lambda_j - \lambda).$$

Without loss of generality, we may renumber the commodities such that
$\lambda_1 < \lambda_2 < \lambda_3 < \cdots < \lambda_r$.

Also, assume that $\lambda_j < 0$ for j = 1, 2,..., R, $0 \leq R \leq r$. Because the
function $g(\cdot)$ is very special, we see that:

$$g(\lambda_j^-) = +\infty, \quad g(\lambda_j^+) = -\infty, \quad j = 1, 2, \ldots, R,$$

$$g(0) = 1 - \sum_{j=1}^{r} p_j b_{jy} > 0,$$

$$g(\lambda_j^-) = -\infty, \quad g(\lambda_j^+) = +\infty, \quad j = R + 1, \ldots, r.$$

Thus, all characteristic roots of the matrix $\bar{A}$ are real and

$$\lambda_{\min} > \min(0, \lambda_1),$$

$$\lambda_{\max} < \lambda_r.$$

Thus, if $|\lambda_j| < 1$, j = 1, 2,..., r, then the matrix $\bar{A}$ will have no characteristic
root exceeding unity in absolute value.

Similarly, if $\lambda_j < 1$, j = 1, 2,..., r, then all the characteristic roots of the
matrix $(\bar{A} - I)$ will be negative and all its principal minors will alternate
in sign. Thus all the principal minors of the matrix $(-h + A) = (\bar{A} - I)h$
will alternate in sign also. We thus have

Theorem 3.3. If the utility function is separable in $\psi^j$, or, if the jth good
is purely habit forming, then

(A) If $\bar{a}_{jj} < 1 - p_j b_{jy}$, then the adaptation process is stable in the
continuous case.

(B) If $|\bar{a}_{jj}| < 1 - p_j b_{jy}$, then the adaptation process is stable in the
discrete case.

Remarks. In the special case r = 1, the nonzero characteristic root of
the matrix $\bar{A}$ is $\bar{a}_{11}$ and, accordingly, if $\bar{a}_{11} < 1$ then the adaptation process
is stable in the continuous case. In the discrete case, the adaptation process
is stable if $|\bar{a}_{11}| < 1$.


4. Short Run and Long-Run Demand 

In this section we shall examine the properties of the long-run demand
functions and their relation to the equilibrium short-run demand functions.
We shall assume that for any set of prices and of income there
exists an equilibrium demand vector $X^*(P, y)$ to which demand
$X(P, y; H_1, H_2, \ldots, H_r)$ converges, if prices P and income y remain
unchanged.

Theorem 4.1. If the adaptation process is stable then the equilibrium
demand vector $X^*(P, y)$ to which demand $X(P, y; H)$ converges is unique.

Proof. By the continuity of the demand functions, the demand vector
$X^*(P, y)$ must satisfy the equations:

$$X_i^* = X_i(P, y; X_1^*/h_1, \ldots, X_r^*/h_r), \quad i = 1, 2, \ldots, n, \; r \leq n. \tag{4.1}$$

Let $X_i$ (i = 1, 2,..., n) be any other point such that

$$X_i = X_i(P, y; X_1/h_1, X_2/h_2, \ldots, X_r/h_r), \quad i = 1, 2, \ldots, n, \; r \leq n. \tag{4.2}$$

By the mean value theorem, we have:

$$X_i - X_i^* = \sum_{j=1}^{n} \bar{a}_{ij}(X_j - X_j^*), \quad i = 1, 2, \ldots, n, \tag{4.3}$$

which may be written in the matrix form:

$$(\bar{A} - I) \cdot \Delta = 0, \tag{4.4}$$

where $\Delta$ is the n-component column vector whose ith element is $(X_i - X_i^*)$.

Since the adaptation process is stable, then $(\bar{A} - I)$ is nonsingular and
the only solution to Eq. (4.4) is $\Delta = 0$, which implies that $X_i = X_i^*$,
i = 1, 2,..., n, proving the theorem.

We may call $X^*(P, y)$, to which demand converges, the long-run demand
functions, as opposed to the short-run demand functions $X(P, y; H)$.
$X^*(P, y)$ is the solution to the system of equations

$$\psi^i_1 \cdot U_i - \lambda p_i = 0, \quad i = 1, 2, \ldots, n, \tag{4.5}$$

$$\sum_i p_i X_i - y = 0, \tag{4.6}$$

where the partial derivatives are evaluated at $H_j = X_j/h_j$, j = 1, 2,..., r,
and $\psi^i_1 = 1$ for i = r + 1,..., n. Note that $X^*(P, y)$ is not, in general, the
solution to the problem:

$$\max[U(\psi^1(X_1, \Theta_1(X_1/h_1)), \ldots, \psi^r(X_r, \Theta_r(X_r/h_r)), X_{r+1}, \ldots, X_n)],$$

$$\text{s.t.} \quad \sum_i p_i X_i - y = 0.$$

$X^*(P, y)$ is the solution to the above problem only in the special case r = n
and:

= W, i - 1, 2,..., n, (4.7) 

where k is a pure constant.

Since $X^*(P, y)$ does not, in general, maximize U, what are the properties
of these long-run demand functions? Is it possible that $X^*(P, y)$ maximizes
another preference function, say, W, subject to the same constraint (i.e.,
subject to $P'X^* - y = 0$)? If so, then under what conditions? And what
is the relation between $U(\cdot)$ and $W(\cdot)$?

Using the equations

$$X_i^* = X_i(P, y; X_1^*/h_1, \ldots, X_r^*/h_r) = X_i^*(P, y), \quad i = 1, 2, \ldots, n, \tag{4.8}$$

and introducing the notation $b^*_{ij}$ for the partial derivative of $X_i^*$ with
respect to price j and $b^*_{iy}$ for the partial derivative of $X_i^*$ with respect to
income, then by the mean value theorem, we may write for all i:

$$dX_i^* = b^*_{i1}\,dp_1 + b^*_{i2}\,dp_2 + \cdots + b^*_{in}\,dp_n + b^*_{iy}\,dy$$

$$= b_{i1}\,dp_1 + b_{i2}\,dp_2 + \cdots + b_{in}\,dp_n + b_{iy}\,dy + \bar{a}_{i1}\,dX_1^* + \bar{a}_{i2}\,dX_2^* + \cdots + \bar{a}_{ir}\,dX_r^*;$$



thus, we have

$$(I - \bar{A})\,dX^* = (B \;\; b_y) \begin{bmatrix} dP \\ dy \end{bmatrix}, \tag{4.9}$$

$$dX^* = (B^* \;\; b_y^*) \begin{bmatrix} dP \\ dy \end{bmatrix}, \tag{4.10}$$

where B and $B^*$ are the two n × n matrices $[b_{ij}]$ and $[b^*_{ij}]$, respectively,
and $b_y$, $b_y^*$, $dX^*$, and $dP$ are the n-element column vectors whose typical
elements are $b_{iy}$, $b^*_{iy}$, $dX_i^*$, $dp_i$, respectively.

Our stability assumption implies that the matrix $(I - \bar{A})$ is nonsingular,
thus

$$[(I - \bar{A})^{-1}B \;\; (I - \bar{A})^{-1}b_y] \begin{bmatrix} dP \\ dy \end{bmatrix} = [B^* \;\; b_y^*] \begin{bmatrix} dP \\ dy \end{bmatrix}, \tag{4.11}$$

and since the vector $(dP \;\; dy)'$ may be freely chosen, we then have:

$$(I - \bar{A})^{-1}B = B^*, \tag{4.12}$$

$$(I - \bar{A})^{-1}b_y = b_y^*. \tag{4.13}$$

Thus, we see that the adaptation matrix $(I - \bar{A})^{-1}$ maps the equilibrium
short-run price and income derivatives to the corresponding long-run
derivatives.

We are now able to derive the slopes of the demand curves keeping real
income constant. Denoting the slopes of the compensated demand curves
by $S_{ij}$ and $S^*_{ij}$, we have the Slutsky Equations

$$S = B + b_y X^{*\prime}, \tag{4.14}$$

$$S^* = B^* + b_y^* X^{*\prime}, \tag{4.15}$$

where S and $S^*$ are the two n × n matrices $[S_{ij}]$ and $[S^*_{ij}]$, respectively.

Premultiplying both sides of Eq. (4.14) by $(I - \bar{A})^{-1}$ and using Eqs. (4.12)
and (4.15), we get

$$(I - \bar{A})^{-1} S = S^*. \tag{4.16}$$

Thus, we also see that the adaptation matrix $(I - \bar{A})^{-1}$ maps the
equilibrium short-run compensated price elasticities into the corresponding
long-run ones. Assuming that the short-run demand functions
correspond to short-run convex and smooth indifference curves, so that the
elasticities of substitution are defined everywhere, we see that the long-run
indifference curves are smooth as well. These long-run indifference curves,
however, may or may not be convex.

Similarly, while the short-run indifference map is invariant under price
changes, it is not necessarily true that the long-run indifference map is so
invariant. It is of interest, therefore, to investigate the conditions under
which a consistent set of long-run indifference curves exists. To do so,
we shall first investigate the case r = 1; i.e., only the first commodity
experiences learning changes.

In this case, the matrix $(I - \bar{A})$ is an elementary matrix and Eq. (4.13)
reduces to:

$$b^*_{iy} = b_{iy} + (b_{1y}/(1 - \bar{a}_{11}))\,\bar{a}_{i1}, \quad i = 1, 2, \ldots, n. \tag{4.17}$$

The equilibrium habit variable, $H_1^*$, satisfies the relation $H_1^* = X_1^*/h_1$,
which upon differentiation with respect to y gives

$$\partial H_1^*/\partial y = (1/h_1)[b_{1y} + (\partial X_1/\partial H_1^*) \cdot (\partial H_1^*/\partial y)],$$

which implies

$$\partial H_1^*/\partial y = b_{1y}/[h_1(1 - \bar{a}_{11})], \tag{4.18}$$

which is positive if the first commodity is a normal good (note that
$1 - \bar{a}_{11} > 0$ by our stability assumption). The term $b_{1y}/[h_1(1 - \bar{a}_{11})]$ is, then,
the change in the equilibrium habit strength variable due to a first-order
change in income, mutatis mutandis.

Similarly, from Eq. (4.12), we see that

$$b^*_{ij} = b_{ij} + (b_{1j}/(1 - \bar{a}_{11}))\,\bar{a}_{i1}, \quad i, j = 1, 2, \ldots, n. \tag{4.19}$$
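
The mapping from short-run to long-run price derivatives in the case r = 1 can be sketched numerically. The matrices below are hypothetical; the helper `solve` is an illustrative Gauss-Jordan routine used to compute $(I - \bar{A})^{-1}B$, which is then compared with the closed form $b_{ij} + \bar{a}_{i1} b_{1j}/(1 - \bar{a}_{11})$.

```python
def solve(a, b):
    """Solve a * x = b for the matrix x by Gauss-Jordan elimination."""
    n = len(a)
    aug = [a[i][:] + b[i][:] for i in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


# Hypothetical short-run price derivatives b_ij and first column of A-bar (r = 1).
B = [[-2.0, 0.3, 0.1],
     [0.4, -1.5, 0.01],
     [0.5, 0.1, -1.0]]
a_bar_col = [0.3, -0.1, -0.2]  # a-bar_{i1}; all other columns of A-bar are zero
n = len(B)

I_minus_A = [[(1.0 if i == j else 0.0) - (a_bar_col[i] if j == 0 else 0.0)
              for j in range(n)] for i in range(n)]
B_star = solve(I_minus_A, B)
closed_form = [[B[i][j] + a_bar_col[i] * B[0][j] / (1.0 - a_bar_col[0])
                for j in range(n)] for i in range(n)]
print(B[1][2], B_star[1][2])
```

In this example the short-run cross derivative $b_{23}$ is positive while its long-run counterpart is negative, a sign reversal that the formula permits whenever $\bar{a}_{i1}$ and $b_{1j}$ have opposite signs.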

Perhaps the most interesting consequence of Eqs. (4.19) is that some of
the cross price derivatives in the short run may change sign in the long run.
To illustrate: suppose, for example, that all goods are gross price
substitutes to the first good. If, in addition, the first good is purely habit
forming and $\bar{a}_{11} > 0$, then all $\bar{a}_{i1} < 0$ and we see that $b^*_{ij}$ may be negative
even if $b_{ij}$ is positive. Here we find that learning has the effect of forming
associations between commodities. Thus, we may safely assume that
complementarity is a learning phenomenon rather than a general one.
On the other hand, if there are short-run connections or associations
between two goods ($b_{ij} < 0$, $i \neq j$), then learning strengthens these
connections unless there are also short-run connections between good j
and the first good ($b_{1j} < 0$).



A similar relationship exists for the compensated price change. From
(4.16), we obtain:

$$S^*_{ij} = S_{ij} + (S_{1j}/(1 - \bar{a}_{11}))\,\bar{a}_{i1}, \quad i, j = 1, 2, \ldots, n. \tag{4.20}$$

If the long-run demand functions came from utility maximization, then,
among other things, the matrix $S^*$ must be symmetric. But from Eq. (4.20)
we see that if S is symmetric, then $S^*$ is also symmetric if and only if

$$\bar{a}_{i1} S_{11} = \bar{a}_{11} S_{i1}, \quad i = 2, \ldots, n. \tag{4.21}$$

We now show that if the adaptation process is stable, then the matrix $S^*$
is negative semidefinite. From Eq. (4.20), we see that the jth principal
minor of the matrix S is

$$M(j, S) = (1 - \bar{a}_{11}) \begin{vmatrix}
S^*_{11} & S^*_{12} & \cdots & S^*_{1j} \\
-\bar{a}_{21}S^*_{11} + S^*_{21} & -\bar{a}_{21}S^*_{12} + S^*_{22} & \cdots & -\bar{a}_{21}S^*_{1j} + S^*_{2j} \\
\vdots & \vdots & & \vdots \\
-\bar{a}_{j1}S^*_{11} + S^*_{j1} & -\bar{a}_{j1}S^*_{12} + S^*_{j2} & \cdots & -\bar{a}_{j1}S^*_{1j} + S^*_{jj}
\end{vmatrix}.$$

If we multiply the first row by $\bar{a}_{k1}$ and add it to the kth row,
k = 2, 3,..., j, we get:

$$M(j, S) = (1 - \bar{a}_{11})\,M(j, S^*).$$

Thus, if the adaptation process is stable, then $(1 - \bar{a}_{11}) > 0$, and we see
that all the principal minors of both S and $S^*$ have the same sign. We now
have

Theorem 4.2. In the special case r = 1, if the adaptation process is
stable, then the long-run demand functions can be rationalized by a utility
function if and only if the short-run utility function is of the form

$$U(\cdot) = F(\psi^1(X_1, \Theta_1), Z(X_2, \ldots, X_n)). \tag{4.22}$$

Proof. The long-run Slutsky matrix is symmetric if and only if
condition (4.21) holds. From Theorem 2.3, part (A), we see that
condition (4.21) holds if and only if the short-run utility function is of the
form (4.22). Also, if the adaptation process is stable, then the matrix $S^*$
is negative semidefinite. Thus according to the integrability theorem, a
utility function which rationalizes the long-run demand functions exists,
proving the theorem.

We now turn to the general case $2 \leq r \leq n$. From (4.16), we see that
$S(I - \bar{A})' = (I - \bar{A})\,S^*(I - \bar{A})'$. Thus, if S is symmetric, $S^*$ is also
symmetric if and only if $(I - \bar{A})S$ is symmetric. But $(I - \bar{A})S$ is symmetric
if and only if

$$\sum_{k=1}^{r} \bar{a}_{ik} S_{kj} = \sum_{k=1}^{r} \bar{a}_{jk} S_{ki}, \quad i, j = 1, 2, \ldots, n. \tag{4.23}$$

But $\bar{a}_{ik} = (\partial X_i/\partial\Theta_k)(\Theta_k'/h_k)$ and since the functions $\Theta_k(H_k)$ may be
chosen freely, then the matrix $S^*$ is symmetric if and only if

$$\bar{a}_{ik} S_{kj} = \bar{a}_{jk} S_{ki}, \quad i, j = 1, 2, \ldots, n, \; k = 1, 2, \ldots, r. \tag{4.24}$$

Since S is also symmetric, then conditions (4.24) hold if and only if:

$$\bar{a}_{ik} S_{kk} = \bar{a}_{kk} S_{ik}, \quad i = 1, 2, \ldots, n, \; k = 1, 2, \ldots, r. \tag{4.25}$$

Also, symmetry of both S and $S^*$ implies that $S = (I - \bar{A})\,S^* = S^*(I - \bar{A})'$
and we see that $\bar{A}S^* = S^*\bar{A}'$. Also, $(\bar{A})^m S^* = S^*(\bar{A}')^m$.

Now, if the adaptation process is stable, then all the principal minors of
the matrix $(I - \bar{A})$ are positive and there exists a nonsingular matrix B
such that $(I - \bar{A}) = BB$. In fact the matrix B exists as an infinite
polynomial in $\bar{A}$. Thus we have:

$$S = (I - \bar{A})\,S^* = BBS^* = BS^*B'.$$

Thus, we see that $S^*$ is negative semidefinite if and only if the matrix S
is negative semidefinite. We thus have

Theorem 4.3. In the general case, r > 1, if the adaptation process is
stable, then the long-run demand functions can be rationalized by a utility
function if and only if the short-run utility function is of the form

$$U(\cdot) = F(\psi^1(X_1, \Theta_1) + \psi^2(X_2, \Theta_2) + \cdots + \psi^r(X_r, \Theta_r) + Z(X_{r+1}, \ldots, X_n)). \tag{4.26}$$

Proof. The proof follows immediately from Theorem 2.3, part (B)
and the integrability theorem. In this case, the long-run utility function is
easily seen to exist in the form

$$W(\cdot) = k_1(X_1) + k_2(X_2) + \cdots + k_r(X_r) + Z(X_{r+1}, \ldots, X_n), \tag{4.27}$$

where

$$k_i(X_i) = \int \psi^i_1(X_i, \Theta_i(X_i/h_i))\,dX_i, \quad i = 1, 2, \ldots, r, \tag{4.28}$$

that is

$$k_i' = \psi^i_1(X_i, \Theta_i(X_i/h_i)), \quad i = 1, 2, \ldots, r. \tag{4.29}$$

And,

$$k_i'' = \psi^i_{11} + \psi^i_{12} \cdot \Theta_i'/h_i, \quad i = 1, 2, \ldots, r. \tag{4.30}$$

From the first-order conditions $\psi^i_1 = \lambda p_i$, i = 1, 2,..., r, we see that

$$p_j(\psi^i_{11} a_{ii} + \psi^i_{12}\Theta_i') = p_i \cdot \psi^j_{11} \cdot a_{ji}, \quad i \neq j, \tag{4.31}$$

$$p_j \cdot \psi^i_{11} b_{iy} = p_i \cdot \psi^j_{11} \cdot b_{jy}, \quad i, j = 1, 2, \ldots, r. \tag{4.32}$$

From (4.30), (4.31), (4.32), and Corollary 2.3, we see that

$$k_i'' = (1 - \lambda_i)\,\psi^i_{11}, \tag{4.33}$$

where $\lambda_i = \bar{a}_{ii}/(1 - p_i b_{iy})$, which is less than one if the adaptation process
is stable. Thus $k_i''$ and $\psi^i_{11}$ will have the same sign.


5. Concluding Remarks 

If the short-run utility function is not of the form (4.26), then the
long-run demand functions can be rationalized by the preference function
$W(\cdot) = W(X_1, X_2, \ldots, X_n; p_1, p_2, \ldots, p_n, y)$. Nothing, in general, can be
said about this function other than that it is homogeneous of degree zero
in $(p_1, p_2, \ldots, p_n, y)$. This is because the long-run demand functions are
so homogeneous. Stability of the adaptation process, together with the
strict quasi-concavity of the short-run utility function, implies that the
function $W(\cdot)$ is also strictly quasi-concave in $(X_1, X_2, \ldots, X_n)$.


References 

1. A. E. El-Safty, Adaptive behaviour and the pure theory of consumer demand,
Ph.D. Thesis, MIT, 1972.

2. F. M. Fisher and K. Shell, Tastes and quality change in the pure theory of the true
cost-of-living index, in “Value, Capital and Growth: Papers in Honor of Sir John
Hicks” (J. N. Wolfe, Ed.), Edinburgh University Press, Edinburgh, 1968.

3. H. S. Houthakker and L. D. Taylor (Eds.), “Consumer Demand in the U.S.:
Analysis and Projections,” Harvard Univ. Press, Cambridge, Mass., 1970.

4. W. W. Leontief, Introduction to a theory of the internal structure of functional
relationship, Econometrica 15 (1947), 361-373.

5. R. A. Pollak, Habit formation and dynamic demand functions, J. Political Econ.
78 (1970), 745-764.

6. C. C. von Weizsäcker, Notes on endogenous change of tastes, J. Econ. Theory 3
(1971), 345-372.



JOURNAL OF ECONOMIC THEORY 13, 319-328 (1976) 


Adaptive Behavior and the Existence of Weizsacker’s 
Long-Run Indifference Curves* 

Ahmad E. El-Safty 

Cairo University, Cairo, Egypt and Eastern Michigan University, 
Ypsilanti, Michigan 48197 

Received February 22,1973 


Suppose that past consumptions of the first r commodities (r < n) influence
present consumption. Then, the long-run demand function to which demand
converges maximizes the equilibrium short-run utility function only under very
restrictive conditions. The long-run demand functions can be rationalized by a
utility function, different from the equilibrium short-run utility function, if and
only if the short-run utility function is such that past consumption of any good
that experiences learning or taste changes is separable from all other goods.
The class of such utility functions has been found.


1. Introduction 

In a recent paper which appeared in this Journal, Weizsacker [8] 
presented a dynamic demand model in which past consumption influences 
present consumption. Weizsacker’s main interest in this phenomenon 
centers around its welfare implications. His entire analysis, however, rests 
on the claim that, if the short-run demand functions correspond to smooth 
and convex indifference curves then the long-run demand functions 
correspond to smooth and convex indifference curves as well. This paper 
shows that Weizsacker’s claim is not valid in general and establishes 
necessary and sufficient conditions under which the long-run indifference 
curves exist in the quantity space. 

In order to explore a wide variety of interesting cases, I found it 
desirable and, perhaps, more illuminating to reformulate Weizsacker’s 
dynamic demand model in terms of the widely accepted notion of “learning 

* I am indebted to the Associate Editor, R. Eckaus, R. Solow, and F. M. Fisher for 
their advice and helpful comments on a number of points. I am also grateful to the 
Department of Economics of the Eastern Michigan University and to the members 
of the Methods of Demographic Projection Analysis Project (D.P.A.) at the American 
University in Cairo for their support. I bear full responsibility for the results. 


Copyright © 1976 by Academic Press, Inc. 

All rights of reproduction in any form reserved. 





by doing.” The present formulation involves no new assumptions. 
Furthermore, influences from past periods other than the last period are 
allowed. 


2. The Dynamic Demand Model 

Weizsacker’s dynamic demand model, and that of Houthakker and 
Taylor [3], express the generally accepted idea that current decisions are 
influenced by past behavior. The effect of past behavior is assumed to be 
represented entirely by the current values of the discounted sum of past 
consumptions. In the n-commodity framework, the demand functions take 
the form: 

$$x_i^t = x_i(p_1, p_2, \dots, p_n, y;\ H_1^t, H_2^t, \dots, H_r^t), \qquad i = 1, 2, \dots, n, \tag{1}$$

where y is income and P is the price vector. $H_j^t$, j = 1, 2, ..., r, is the habit 
or state variable for good j at time t and is given by the difference equation: 

$$H_j^t = (1 - h_j) H_j^{t-1} + x_j^{t-1}, \qquad j = 1, 2, \dots, r. \tag{2}$$

For the sake of generality, I assume that only past consumptions of the 
first r commodities (r ≤ n) influence present consumption. In Weizsacker’s 
formulation, n = r = 2 and $h_i = 1$, i = 1, 2. 

If the commodity is a durable good, such as clothing, then $H_i^t$ is the 
stock of the physical good at time t and $h_i$ is its rate of depreciation. 
If the good is habit forming, say, tobacco, then $H_i$ may be interpreted as 
the psychological stock of the smoking habit. 1 If commodity i experiences 
learning changes, that is, if one unit of the commodity in a later period 
provides as much services as more than one unit in an earlier period, 2 
then $H_i$ serves as a learning index and $h_i$ as the rate of forgetfulness. 
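
As an illustration of the difference equation (2), the following minimal sketch iterates the habit stock for a single good when the purchase is held constant. The function name `habit_path` and the numbers used are assumptions made for this example only; they do not come from the paper. Under a constant purchase x, the stock should approach x/h, which is the equilibrium habit level used later in the text.

```python
def habit_path(x, h, H0=0.0, periods=200):
    """Iterate H_t = (1 - h) * H_{t-1} + x for a constant purchase x each period."""
    H = H0
    path = [H]
    for _ in range(periods):
        # Depreciate (or "forget") the old stock, then add last period's purchase.
        H = (1.0 - h) * H + x
        path.append(H)
    return path


# Hypothetical numbers: 2 units purchased per period, depreciation rate 0.25.
path = habit_path(x=2.0, h=0.25)
steady_state = 2.0 / 0.25  # the equilibrium stock x / h
print(path[-1], steady_state)
```

With 0 < h ≤ 1 the iteration is a contraction, so the starting stock H0 only affects the speed of convergence, not the limit.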

Following learning theory terminology, $H_j^t$ is the current effective 
number of reinforced trials practicing task j (consuming good j). $H_j$, 
j = 1, 2, ..., r, are then associative variables that connect the response 
strength (consumption vector) to the stimulus 3 (preferences, income, and 

1 See, for example, Houthakker and Taylor [3]. 

2 This kind of learning has been recognized by many writers. Fisher and Shell [2], for 
example, write “it may be known to be the case that a recently introduced electrical 
appliance increases monotonically in desirability through time during the period in 
which consumers are learning about the usefulness of the appliance in a latter year 
may afford the same services as more than one in an earlier year but with no physical 
change in the good itself... To take a slightly different idealized case, suppose the 
consumers... learn to use a certain fuel more efficiently, getting a certain number of 
BTU’s of a smaller quantity of fuel.” 

3 See, for example, Hull [4] and McGeoch [7]. 





prices). The greater the habit established for a response i (consumption 
of good i), the more effectively that response could compete with the 
alternative responses. The magnitude of the differentiated response 
depends on the relative values (relative marginal benefits) of the response 
to the other responses. The reaction potential of the response i (desirability 
of current purchases of commodity i) is directly correlated with the habit 
strength variable, $H_i$, and some other intervening variables like incentive, 
motivation, and stimulus-intensity dynamism. In this paper, I shall assume 
that the reaction potential of response i, $S_i$, depends only on the habit 
strength $H_i$. 

Translating this into economic terms, the stimulus is the set [objective, 
preferences, income, prices] and the response is the consumption vector. 
Different stimuli evoke different responses. Let 

$$U(x_1, x_2, \dots, x_n;\ \Theta_1(H_1), \Theta_2(H_2), \dots, \Theta_r(H_r))$$

be a monotonic and strictly quasi-concave function of $x_1, x_2, \dots, x_n$ for 
all relevant values of the habit variables. 
Assuming that the objective of the consumer is to maximize U subject 
to the budget constraint, then the demand functions satisfy the first order 
conditions: 

$$U_i(x_1^t, \dots, x_n^t;\ \Theta_1(H_1^t), \dots, \Theta_r(H_r^t)) = \lambda^t p_i, \qquad i = 1, \dots, n, \tag{3}$$

$$\sum_{i=1}^{n} p_i x_i^t = y, \tag{4}$$

where $U_i$ denotes the partial derivative of U with respect to its ith argument 
and $\lambda^t$ is the current Lagrange multiplier. 

We shall assume that, for any given set of prices and of income, there 
exists a unique equilibrium demand vector, X*(P, y), to which demand 
X(P, y; H) converges if prices P and income y remain unchanged. 

The equilibrium habit variables are given by $H_i^* = x_i^*/h_i$, i = 1, 2, ..., r, 
where $x_i^*$ is the equilibrium purchase of good i and is given by: 

$$x_i^*(P, y) = x_i(P, y;\ \Theta_1(x_1^*/h_1), \Theta_2(x_2^*/h_2), \dots, \Theta_r(x_r^*/h_r)), \qquad i = 1, 2, \dots, n. \tag{5}$$

X*(P, y) are the long-run demand functions as opposed to the short-run 
demand functions X(P, y; H). 
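
The passage from short-run to long-run demand can be sketched numerically. The example below is not the paper's model: it assumes, purely for illustration, a linear-expenditure short-run demand whose minimum quantities are proportional to the habit stocks (`beta[i] * H[i]`), and hypothetical parameter values chosen so that the adaptation process is stable. Habits are updated by the difference equation (2) until demand settles down; the limit plays the role of X*(P, y).

```python
def short_run_demand(p, y, H, a, beta):
    """Assumed linear-expenditure demand with habit-dependent minimum quantities."""
    gamma = [b * Hi for b, Hi in zip(beta, H)]                  # committed quantities
    supernumerary = y - sum(pi * gi for pi, gi in zip(p, gamma))
    return [gi + ai * supernumerary / pi for gi, ai, pi in zip(gamma, a, p)]


def long_run_demand(p, y, a, beta, h, periods=500):
    """Iterate demand and the habit equation H_t = (1 - h) H_{t-1} + x_{t-1}."""
    H = [0.0] * len(p)
    x = short_run_demand(p, y, H, a, beta)
    for _ in range(periods):
        H = [(1.0 - hi) * Hi + xi for hi, Hi, xi in zip(h, H, x)]
        x = short_run_demand(p, y, H, a, beta)
    return x, H


# Hypothetical parameters: two goods, budget shares a summing to one.
p, y = [1.0, 2.0], 10.0
a, beta, h = [0.5, 0.5], [0.1, 0.1], [0.5, 0.8]
x_star, H_star = long_run_demand(p, y, a, beta, h)
print(x_star, H_star)
```

At the limit the habit stocks equal x*/h good by good, and the budget constraint holds in every period because the assumed budget shares sum to one.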

The long-run demand functions X*(P, y) satisfy the first order conditions: 

$$U_i(x_1^*, \dots, x_n^*;\ \Theta_1(x_1^*/h_1), \dots, \Theta_r(x_r^*/h_r)) = \lambda^* p_i, \qquad i = 1, 2, \dots, n, \tag{6}$$

$$\sum_{i=1}^{n} p_i x_i^* = y. \tag{7}$$





Note that X*(P, y) does not, in general, maximize the equilibrium 
short-run utility function: 

$$U^*(\cdot) = U^*(x_1, x_2, \dots, x_n) = U(x_1, x_2, \dots, x_n;\ \Theta_1(x_1/h_1), \dots, \Theta_r(x_r/h_r)).$$

X*(P, y) maximizes U* only if 

$$U_i^* = U_i + U_{n+i} \cdot \Theta_i'/h_i = \psi(x_1, x_2, \dots, x_n)\, U_i, \qquad i = 1, 2, \dots, n, \tag{8}$$

where $\psi(x_1, x_2, \dots, x_n)$ is an integrating factor. 

Clearly, if the short-run utility function satisfies condition (8), then the 
long-run demand function satisfies the Slutsky symmetry condition. The 
long-run Slutsky matrix, however, may not be negative semidefinite. 
El-Safty [1] shows that if the long-run Slutsky matrix is symmetric, it is 
negative semidefinite if and only if the adaptation process is stable. We 
now have: 

Theorem 1. If the adaptation process is stable everywhere, then the 
long-run demand functions uniquely maximize the equilibrium short-run 
utility function if and only if 

$$\Theta_i' \cdot U_{n+i} = h_i\, \psi(x_1, \dots, x_n)\, U_i, \qquad i = 1, 2, \dots, n. \tag{C.1}$$

Condition (C.1) is very restrictive if only because it does not allow for 
the case when only some commodities experience learning changes 
(the case n ≠ r). Furthermore, if condition (C.1) is satisfied, then the 
long-run indifference map coincides with the equilibrium short-run 
indifference map. In such a case, the compensated slopes of demand in the 
long-run are identical to those of the equilibrium short-run and 
Weizsacker’s welfare analysis will break down. This will be the case if, 
for example, the short-run utility function takes the form: 

$$U(\cdot) = \sum_i \sum_j (x_i - \alpha H_i)(x_j - \alpha H_j) \quad\text{and}\quad h_i = h, \qquad i = 1, 2, \dots, n.$$

The interesting case, however, is when the long-run indifference map 
exists and differs from the equilibrium short-run indifference map. For 
this to be the case, there must exist an increasing, twice differentiable, 
and strictly quasi-concave function $W(x_1, x_2, \dots, x_n)$ such that, for all 
values of $x_1, x_2, \dots, x_n$, we have: 

$$W_i = \mu(x_1, x_2, \dots, x_n)\, U_i, \qquad i = 1, 2, \dots, n, \tag{9}$$





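where $\mu(x_1, x_2, \dots, x_n)$ is an integrating factor. Symmetry of the cross 
partial derivatives of both W and U implies 

$$\mu_i U_j - \mu_j U_i = \mu\,\bigl(U_{i,n+j} \cdot \Theta_j'/h_j - U_{j,n+i} \cdot \Theta_i'/h_i\bigr), \qquad i \neq j,\ i, j = 1, 2, \dots, n, \tag{10}$$

where $\Theta_k' = 0$ for k = r + 1, ..., n. Replacing $\mu$ by $\exp(\mu)$, Eq. (10) 
reduces to 

$$\mu_i U_j - \mu_j U_i = U_{i,n+j} \cdot \Theta_j'/h_j - U_{j,n+i} \cdot \Theta_i'/h_i, \qquad i \neq j,\ i, j = 1, 2, \dots, n. \tag{11}$$

For i = s, s = 1, 2, ..., r, we have 

$$\mu_s U_j - \mu_j U_s = U_{s,n+j} \cdot \Theta_j'/h_j - U_{j,n+s} \cdot \Theta_s'/h_s, \qquad s = 1, 2, \dots, r,\ j = 1, 2, \dots, n,\ j \neq s. \tag{12}$$

For j = s, s = 1, 2, ..., r, we have 

$$\mu_i U_s - \mu_s U_i = U_{i,n+s} \cdot \Theta_s'/h_s - U_{s,n+i} \cdot \Theta_i'/h_i, \qquad s = 1, 2, \dots, r,\ i = 1, 2, \dots, n,\ i \neq s. \tag{13}$$

Multiplying Eq. (12) by $U_i$ and Eq. (13) by $U_j$, adding, and using 
Eq. (11) to eliminate the terms in $\mu$, we get: 

$$(U_s U_{i,n+j} - U_i U_{s,n+j})\,\Theta_j'/h_j + (U_j U_{s,n+i} - U_s U_{j,n+i})\,\Theta_i'/h_i + (U_i U_{j,n+s} - U_j U_{i,n+s})\,\Theta_s'/h_s = 0,$$
$$s = 1, 2, \dots, r,\quad i \ \&\ j = 1, 2, \dots, n,\quad i \neq j \neq s. \tag{14}$$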

Thus, for the long-run demand functions to be rationalized by a utility 
function, the short-run utility function must be such that, at equilibrium, 
Equations (14) hold as identities in X. The taste or learning parameters, 
$\Theta_i'$, i = 1, 2, ..., r, are not assumed to have any particular properties. 
Thus, for the function U(·) to satisfy (14) for any choice of $\Theta'$, 
Equations (14) must also hold as identities in $\Theta_i'$, i = 1, 2, ..., r. 
Accordingly, the function U(·) must be such that: 

$$U_i U_{j,n+s} - U_j U_{i,n+s} = 0, \qquad i \neq j \neq s,\quad s = 1, 2, \dots, r,\quad i, j = 1, 2, \dots, n. \tag{15}$$

If the function U(·) is such that (14) holds but not (15), then (14) uniquely 
determines $\Theta_i'$, i = 1, 2, ..., r (up to a scalar multiple), and the function 
U(·) cannot satisfy (14) for any other choice of $\Theta'$. This case occurs, 





for example, when n = r and the function U(·) and all $\Theta_i'$ are such that 
condition (C.1) is satisfied. 

We shall now investigate the implications of condition (15), and show 
that it is also sufficient. We shall first investigate the case r > 1. The 
special case r = 1 is treated separately. 

In order to avoid reference to special cases, the following definition is 
used: W(X, Θ) is an X-transformation of U(X, Θ) if and only if the 
function U can be written in the form 

$$U(X, \Theta) = F(W(X, \Theta), \Theta).$$

Clearly, the functions U and W will produce the same short-run demand 
functions, provided $F_1 > 0$. Also, U satisfies (15) if and only if W does. 

Theorem 2. If the adaptation process is stable everywhere, then the 
long-run demand functions can be rationalized by a utility function for any 
choice of $\Theta'$ if and only if, for each Θ, there exists an X-transformation 
W(·, Θ) of U(·, Θ) in the form: 

$$W = f(x_1, x_2, \dots, x_n) + \eta_1(x_1, \Theta_1) + \cdots + \eta_r(x_r, \Theta_r). \tag{C.2}$$

Proof. Necessity. Equations (15) imply: 

$$(\partial/\partial\Theta_s)[U_i/U_j] = 0, \qquad i \neq j \neq s.$$

Thus, $U_i/U_j$ is independent of all Θ’s, except possibly $\Theta_i$ and $\Theta_j$. That is, 

$$U_i/U_j = F^{ij}(x_1, x_2, \dots, x_n;\ \Theta_i, \Theta_j), \qquad i \neq j.$$

But since $U_i/U_k$, i ≠ j ≠ k, is independent of $\Theta_j$ and $U_j/U_k$ is 
independent of $\Theta_i$, the function $F^{ij}$ must be of the form: 

$$F^{ij} = f^i(x_1, x_2, \dots, x_n;\ \Theta_i)\,/\,f^j(x_1, x_2, \dots, x_n;\ \Theta_j).$$

Thus, the function U must be such that 

$$U_i = K(x_1, \dots, x_n;\ \Theta_1, \dots, \Theta_r)\ \phi^i(x_1, \dots, x_n;\ \Theta_i), \qquad K \neq 0.$$

Symmetry of the cross partial derivatives of U implies: 

$$\phi_j^i - \phi_i^j = (K_i \phi^j - K_j \phi^i)/K. \tag{16}$$

But since the right hand side of (16) is independent of $\Theta_k$, k ≠ i ≠ j, 
then, 

(a) $\phi_j^i - \phi_i^j = K_j \phi^i - K_i \phi^j = 0$, i ≠ j, or, 

(b) $K(x_1, \dots, x_n;\ \Theta_1, \dots, \Theta_r) = K^1(x_1, \dots, x_n)\, K^2(\Theta_1, \dots, \Theta_r)$. 





If (a) holds, then $\phi_j^i = \phi_i^j$, i ≠ j, and, since $U_i$ has continuous first 
derivatives, then, according to the classical version of Frobenius’ theorem, 
there exists a function W such that 

$$W_i = \phi^i(x_1, \dots, x_n;\ \Theta_i), \qquad i = 1, 2, \dots, n. \tag{17}$$

Thus, 

$$U(\cdot, \Theta) = F(W(\cdot, \Theta), \Theta), \qquad\text{where } F_1 = K.$$

From (17) we see that $W_{i,n+s} = 0$, s ≠ i, which implies that the function 
W exists in the form (C.2), which completes the proof. If (b) holds, then 

$$U_i = K^2 \cdot \psi^i(x_1, \dots, x_n;\ \Theta_i), \qquad\text{where } \psi^i = K^1 \phi^i.$$

Again, symmetry of the cross partial derivatives of U implies that $\psi_j^i = \psi_i^j$, 
and the proof is completed as in (a). 

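Sufficiency. The long-run demand functions satisfy the first order 
conditions (here $\eta_j'$ denotes the partial derivative of $\eta_j$ with respect 
to $x_j$, and $f_j$ that of f): 

$$\eta_j'(x_j, \Theta_j(x_j/h_j)) + f_j = \lambda p_j, \qquad j = 1, 2, \dots, r,$$

$$f_j = \lambda p_j, \qquad j = r + 1, \dots, n,$$

$$\sum_{i=1}^{n} p_i x_i = y.$$

Consider the problem: 

maximize $Z = f(x_1, \dots, x_n) + \xi_1(x_1) + \cdots + \xi_r(x_r)$, 
subject to $\sum_{i=1}^{n} p_i x_i = y$, 

where 

$$\xi_s(x_s) = \int \eta_s'(x_s, \Theta_s(x_s/h_s))\, dx_s.$$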

Clearly, the long-run demand functions satisfy the first order condition 
of the above maximization problem. The second order condition is also 
satisfied in view of our stability assumption. This completes the proof 
of the theorem. 

Note 1. If the short-run utility function is defined everywhere, then 
the long-run utility function is also defined everywhere. This observation 
is crucial to Weizsacker’s welfare analysis which assumes that there exists 
a long-run indifference curve passing through any given point in the 
commodity space (see Note 2). 





We now turn to the special case r = 1, when only the first commodity 
experiences learning or taste changes. This is the case of a newly introduced 
good to the individual’s consumption pattern. 

Theorem 3. In the special case r = 1, if the adaptation process is stable 
everywhere, then the long-run demand functions can be rationalized by a 
utility function if and only if the short-run utility function is of the form: 

$$U(\cdot) = U(x_1, \Theta_1, F(x_1, x_2, \dots, x_n)). \tag{C.3}$$

Proof. Necessity. In this case, condition (15) implies: 

$$(\partial/\partial\Theta_1)[U_i/U_j] = 0, \qquad i \neq j \neq 1. \tag{18}$$

Condition (18) is equivalent to (C.3) by a well-known theorem of 
Leontief [6]. 

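Sufficiency. The long-run demand functions are related to the 
equilibrium short-run demand functions by: 

$$x_i^*(P, y) = x_i(P, y;\ \Theta_1(x_1^*/h_1)), \qquad i = 1, 2, \dots, n. \tag{19}$$

Differentiating (19) partially with respect to y, we get: 

$$b_{iy}^* = b_{iy} + \delta \cdot a_{i1} \cdot b_{1y}, \qquad i = 1, 2, \dots, n, \tag{20}$$

where 

$$b_{iy}^* = \frac{\partial x_i^*}{\partial y}, \qquad b_{iy} = \frac{\partial x_i}{\partial y}, \qquad a_{i1} = \frac{\partial x_i}{\partial \Theta_1} \cdot \Theta_1', \qquad\text{and}\qquad \delta = \frac{1}{h_1 - a_{11}}.$$

Similarly, differentiating (19) with respect to $p_j$, we get: 

$$b_{ij}^* = b_{ij} + \delta \cdot a_{i1} \cdot b_{1j}, \qquad i \ \&\ j = 1, 2, \dots, n, \tag{21}$$

where 

$$b_{ij}^* = \frac{\partial x_i^*}{\partial p_j} \qquad\text{and}\qquad b_{ij} = \frac{\partial x_i}{\partial p_j}.$$

Denoting the slopes of the compensated demand curves by $S_{ij}$ and $S_{ij}^*$, 
we have the Slutsky equations: 

$$S_{ij} = b_{ij} + x_j^* b_{iy}, \qquad i \ \&\ j = 1, 2, \dots, n; \tag{22}$$

$$S_{ij}^* = b_{ij}^* + x_j^* b_{iy}^*, \qquad i \ \&\ j = 1, 2, \dots, n. \tag{23}$$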





Using Eqs. (20), (21), (22), and (23), we get 

$$S_{ij}^* = S_{ij} + \delta \cdot a_{i1} \cdot S_{1j}. \tag{24}$$

Since the short-run demand functions are derived from a utility 
maximization, then $S_{ij} = S_{ji}$, and from equation (24) we see that the long-run 
demand functions satisfy the Slutsky symmetry condition if and only if: 

$$a_{i1} S_{1j} = a_{j1} S_{1i}, \qquad i \ \&\ j = 1, 2, \dots, n. \tag{25}$$

We now show that the long-run demand functions satisfy the Slutsky 
symmetry condition. 

The short-run demand functions satisfy the first order conditions, 

$$U_1 + U_3 F_1 = \lambda p_1, \tag{26}$$

$$U_3 F_i = \lambda p_i, \qquad i = 2, \dots, n, \tag{27}$$

$$\sum_{i=1}^{n} p_i x_i = y. \tag{28}$$

By eliminating λ and $U_3$ from (27) and using (28), we see that $(x_2, x_3, \dots, x_n)$ 
can be solved in terms of $(x_1, y - p_1 x_1, p_2, \dots, p_n)$. Thus, we have: 

$$x_i = f^i(x_1, y - p_1 x_1, p_2, \dots, p_n), \qquad i \neq 1. \tag{29}$$

By differentiating (29) with respect to y and $p_1$ and using the Slutsky 
equation, we get 

$$S_{i1} = (f_1^i - p_1 f_2^i)\, S_{11}. \tag{30}$$

Differentiating (29) with respect to $\Theta_1$, we get 

$$a_{i1} = (f_1^i - p_1 f_2^i)\, a_{11}. \tag{31}$$

From (30) and (31), we see that Eqs. (25) are satisfied. Thus the long-run 
Slutsky matrix is symmetric, and, in view of our stability assumption, 
it is negative semidefinite. Thus, according to the integrability theorem, 
smooth and convex indifference curves exist in the quantity space in the 
neighborhood of all possible equilibrium points. 

Note 2. If the short-run utility function defined in (C.3) is defined 
everywhere, then the long-run utility function exists but may not be 
defined everywhere unless every point in the commodity space is a possible 
equilibrium point, which I am not able to show at the moment. In such a case 
Weizsacker’s welfare analysis should be used with some caution. 




Concluding Remark 

Conditions (C.l), (C.2), and (C.3) are necessary and sufficient con¬ 
ditions for the long-run indifference curves to exist in the quantity space. 
These long-run indifference curves are convex if and only if the adaptation 
process is stable. Accordingly, the question of stability of the adaptation 
process can be easily answered by examining the quasi-concavity of the 
long-run utility function. 


References 

1. A. E. El-Safty, Adaptive Behavior, Demand and Preferences, J. Econ. Theory 
13 (1976), 298-318. 

2. F. M. Fisher and K. Shell, Tastes and quality change in the pure theory of the 
true cost-of-living index, in “Value, Capital and Growth: Papers in Honour of 
Sir John Hicks” (J. N. Wolfe, Ed.), Edinburgh University Press. 

3. H. S. Houthakker and L. D. Taylor, “Consumer Demand in the U.S.: Analysis 
and Projections,” Harvard Univ. Press, 1970. 

4. C. L. Hull, “Principles of Behavior: An Introduction to Behavior Theory,” 
Appleton-Century, New York, 1943. 

5. L. Hurwicz, On the problem of integrability of demand functions, in “Preferences, 
Utility, and Demand” (J. S. Chipman et al., Eds.), Harcourt Brace Jovanovich, 
New York. 

6. W. W. Leontief, Introduction to a theory of the internal structure of functional 
relationships, Econometrica 15 (1947), 361-373. 

7. J. A. McGeoch and A. L. Irion, “The Psychology of Human Learning,” 2nd ed., 
Longmans, Green, New York. 

8. C. C. von Weizsacker, Notes on endogenous change of tastes, J. Econ. Theory 3 
(1971), 345-372. 



JOURNAL OF ECONOMIC THEORY 13, 329-340 (1976) 


Endogenous Tastes and Stable Long-Run Choice* 

Peter J. Hammond 

University of Essex, Colchester, C04 3SQ, England 
Received June 10, 1975 


Suppose that short-run preferences depend upon consumption one period 
earlier. Then there is an acyclic long-run strict preference relation iff, for every 
finite set, every conservative choice sequence converges. If long-run preferences 
are acyclic, then a unique long-run choice from a compact set is globally stable. 
If the long-run choice set includes multiple choices, there is a weaker stability 
property. Under special assumptions these results are extended to cases when 
the short-run consumption set is endogenous, and when more previous periods 
affect the present. 


Some recent papers [2, 5, 6, 8] have investigated the existence of a 
long-run utility function when tastes depend upon previous consumption. 
In Pollak’s papers, the consumption set, too, depends upon previous 
consumption. Here, I shall show the relationship between the acyclicity 
of long-run preferences, the uniqueness of long-run choice, and the global 
stability of long-run choice. 

Section 1 contains necessary preliminaries. Section 2 proves three 
stability theorems. Section 3 shows how the results can be extended to 
endogenous consumption sets. Section 4 shows how the results can be 
modified when short-run utility depends upon weighted averages of past 
consumption. Section 5 is a discussion of the results of the earlier sections. 
The final section, Section 6, looks at the implications of the results for the 
theory of long-run demand. In particular, it shows how Pollak’s stability 
theorems can be strengthened. 

1. Endogenous Tastes and Acyclic Long-Run Preferences 

Consider a consumer making a series of choices of consumption vectors. 
Suppose he faces a fixed consumption set X. X need not be convex, and 
may include indivisibilities. Assume the consumer has endogenous tastes, 

* I am grateful to the referee, the Associate Editor, and Avner Shaked for 
perceptive and helpful comments, and to Christian von Weizsäcker and James Mirrlees 
for arousing my interest in the subject. 



so that his tastes in period t are independent of t, but do depend upon 
consumption in period t − 1. Specifically, if $x_{t-1}$ is consumption in 
period t − 1, suppose that tastes in period t are described by the short-run 
utility function $u(x_{t-1}, \cdot)$, which is a function $u(x_{t-1}, x_t)$ of consumption 
in period t, $x_t$. Assume that $u(\cdot, \cdot)$ is defined over the whole set $X \times X$, 
and is continuous in both variables together. 

Corresponding to $u(x, \cdot)$, for each $x \in X$, is a choice function $C(x)(\cdot)$ 
defined on subsets of X as follows: 

$$C(x)(A) = \{\, y \in A \mid z \in A \Rightarrow u(x, y) \geq u(x, z) \,\} \qquad (\text{all } A \subseteq X).$$

C(x)(A) is to be interpreted as the set of options which the consumer is 
willing to choose at time t, given that x was his choice at time t — 1. 

Define a long-run choice function C(-) on X as follows: 

$$C(A) = \{\, x \in A \mid x \in C(x)(A) \,\}.$$

Of course, C(A) may be empty. 1 But, if x e C(A), and if the consumer has 
chosen x once, then he is willing to choose x repeatedly thereafter. 

Define the binary long-run preference relation P on X as follows: 

xPy iff u(y, x) > u(y, y). 

So x P y iff, when y was chosen previously, x is preferred to y in the 
short-run. 

Now $C(A) = \{\, x \in A \mid y\,P\,x \Rightarrow y \notin A \,\}$. 

The transitive completion of P is the binary relation P* on X defined by: 

x P* y iff there is a finite sequence $z_1, z_2, \dots, z_n$ of points of X such that 
$x\,P\,z_1,\ z_1\,P\,z_2,\ \dots,\ z_{n-1}\,P\,z_n,\ z_n\,P\,y$. 

P is acyclic iff P* is irreflexive, i.e., there is no $x \in X$ such that x P* x. 
A standard result is that P is acyclic iff C(A) is nonempty whenever A is 
finite. 

Given $A \subseteq X$, a choice sequence from A is an infinite sequence of points 
$(x_t)$ (t = 0, 1, 2, ...) in A such that, for each t, $x_{t+1} \in C(x_t)(A)$. 

1 James McIntosh has pointed out to me that, in an important class of cases, C(A) 
is not empty. Suppose that (i) A is compact and convex, (ii) for each $x \in X$, $u(x, \cdot)$ is 
quasi-concave (as well as continuous in both variables). Then the correspondence 
$x \mapsto C(x)(A)$ is upper semi-continuous and convex valued. 

C(A) is the set of fixed points, which is nonempty, by Kakutani’s Theorem (cf. 
Theorem 2 of [1]). But, unless long-run preferences, as defined below, are acyclic this 
C(·) may be an ill-behaved choice function; in particular, it may violate all the axioms 
of revealed preference. For properties of such choice functions in demand theory, 
see Sonnenschein [7] and Mas-Colell [4]. 





A choice sequence from A is conservative if: 

$$x_t \in C(x_t)(A) \Rightarrow x_{t+1} = x_t,$$

which means that, at time t + 1, the consumer does not change his choice 
from $x_t$ unless he is not then prepared to choose $x_t$ at all. 2 

Notice that if $(x_t)$ is a conservative choice sequence, and if $x_{t+1} \neq x_t$, 
then, because $x_t \in A$, $x_{t+1}\,P\,x_t$. 

On the other hand, if $(x_t)$ is a conservative choice sequence and 
$x_t \in C(x_t)(A)$, then, for all s > t, $x_s = x_t$. 
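
The definitions of this section can be made concrete on a finite set. The sketch below is an illustration only: the utility function `u`, the set `A`, and the tie-breaking rule (take the smallest maximizer when the consumer must move) are assumptions chosen for the example, not part of the paper.

```python
def short_run_choice(u, x, A):
    """C(x)(A): the options in A that maximize u(x, .)."""
    best = max(u(x, y) for y in A)
    return {y for y in A if u(x, y) == best}


def long_run_choice(u, A):
    """C(A): the options x in A that are chosen again after x was chosen."""
    return {x for x in A if x in short_run_choice(u, x, A)}


def prefers(u, x, y):
    """x P y iff u(y, x) > u(y, y)."""
    return u(y, x) > u(y, y)


def conservative_sequence(u, A, x0, steps):
    """Keep x_t whenever it is still acceptable; otherwise move to a short-run choice."""
    seq = [x0]
    for _ in range(steps):
        x = seq[-1]
        choice = short_run_choice(u, x, A)
        seq.append(x if x in choice else min(choice))  # min() is an arbitrary tie-break
    return seq


# Hypothetical tastes: after consuming x the consumer aims at min(x + 1, 3).
def u(x, y):
    return -abs(y - min(x + 1, 3))


A = [0, 1, 2, 3, 4, 5]
print(long_run_choice(u, A))
print(conservative_sequence(u, A, 0, 6))
```

In this example the long-run choice set can also be computed from the relation P, as the set of options in A to which no other option in A is preferred, which is the characterization of C(A) stated above.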


2. Three Stability Theorems 


Theorem 1. P is acyclic iff, for any finite set A, any conservative choice 
sequence from A converges. 

Proof. (1) Suppose P is acyclic. Let A be any finite set. Let $(x_t)$ 
(t = 0, 1, 2, ...) be any conservative choice sequence from A. 

Case (i). For some t, $x_{t+1} = x_t$. Then, conservatism implies that 
$x_s = x_t$ for all s > t, and so the sequence converges. 

Case (ii). For all t, $x_{t+1} \neq x_t$. Then, because of conservatism, for all t, 
$x_{t+1}\,P\,x_t$. Thus, if s > t, then $x_s\,P^*\,x_t$. But, because A is finite, there 
exist s and t such that s > t and $x_s = x_t$. So $x_t\,P^*\,x_t$, contradicting 
acyclicity. So case (ii) cannot occur. 

(2) Conversely, suppose $x_0\,P^*\,x_0$. In fact, suppose: 

$$x_0\,P\,x_n,\quad x_n\,P\,x_{n-1},\quad x_{n-1}\,P\,x_{n-2},\quad \dots,\quad x_2\,P\,x_1,\quad x_1\,P\,x_0.$$

Let $A = \{x_0, x_1, \dots, x_n\}$. Define $y_0 = x_0$. For t = 1, 2, ..., define $y_t$ 
recursively so that $(y_t)$ (t = 0, 1, 2, ...) is a conservative choice sequence 
from A. Suppose $y_t \to y^*$ as $t \to \infty$. Then $y^* \in A$, because A is finite, and 
there exists s such that $t \geq s$ implies $y_t = y_s = y^*$. Suppose $y_s = x_r$. 
Then, for some q, $x_q\,P\,x_r$, and $x_q \in A$. So $x_r \notin C(x_r)(A)$, and so $y_{s+1} \neq y_s$. 
This is the required contradiction. 
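
Theorem 1 can be checked by hand on very small examples. The sketch below contrasts a cyclic relation, for which a conservative choice sequence never settles, with an acyclic one, for which it does. Both utility functions, the three-element set, and the tie-breaking rule are hypothetical and serve only as an illustration; the cycle test is an ordinary depth-first search and is not taken from the paper.

```python
def prefers(u, x, y):
    """x P y iff u(y, x) > u(y, y)."""
    return u(y, x) > u(y, y)


def is_acyclic(u, X):
    """True if P has no cycle on the finite set X (depth-first search)."""
    succ = {y: [x for x in X if prefers(u, x, y)] for y in X}  # edge y -> x when x P y
    state = {x: 0 for x in X}  # 0 = unvisited, 1 = on the current path, 2 = finished

    def finds_cycle(v):
        state[v] = 1
        for w in succ[v]:
            if state[w] == 1 or (state[w] == 0 and finds_cycle(w)):
                return True
        state[v] = 2
        return False

    return not any(state[v] == 0 and finds_cycle(v) for v in X)


def conservative_sequence(u, A, x0, steps):
    """Stay at x_t if it maximizes u(x_t, .) on A; otherwise move to a maximizer."""
    seq = [x0]
    for _ in range(steps):
        x = seq[-1]
        best = max(u(x, y) for y in A)
        choice = {y for y in A if u(x, y) == best}
        seq.append(x if x in choice else min(choice))
    return seq


X = [0, 1, 2]


def u_cyclic(x, y):
    # Hypothetical tastes: after x, only the "next" option (mod 3) is attractive.
    return 1 if y == (x + 1) % 3 else 0


def u_acyclic(x, y):
    # Hypothetical tastes: after x, the consumer aims at min(x + 1, 2).
    return -abs(y - min(x + 1, 2))


print(is_acyclic(u_cyclic, X), conservative_sequence(u_cyclic, X, 0, 6))
print(is_acyclic(u_acyclic, X), conservative_sequence(u_acyclic, X, 0, 6))
```

In the cyclic case the sequence visits 0, 1, 2 in rotation, which mirrors the construction used in part (2) of the proof.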


Theorem 2. Suppose that: 

(i) A is a compact subset of X; 


2 Cf. the definition of conservatism in [1]. 





(ii) P is acyclic; 

(iii) $(x_t)$ (t = 0, 1, 2, ...) is a conservative choice sequence from A. 

Then any cluster point of the sequence $(x_t)$ is a long-run choice from A. 

Proof. Suppose x* is a cluster point, and $x_{t(n)} \to x^*$ as $n \to \infty$. Then 
$x^* \in A$, because A is closed. Because A is compact, the sequence $(x_{t(n)+1})$ 
(n = 0, 1, 2, ...) also has a cluster point in A. Let this point be x. Assume 
that the subsequence t(n) has been chosen so that $x_{t(n)+1} \to x$ as $n \to \infty$. 

Case (i). For some t, $x_t \in C(x_t)(A)$. Then, because the sequence is 
conservative, for all s > t, $x_s = x_t$, and so $x_s = x^*$. Then x* is the only 
cluster point, and it is a long-run choice from A. 

Case (ii). For every t, $x_t \notin C(x_t)(A)$. Then, for every t, $x_{t+1}\,P\,x_t$. So, 
if s > t, $x_s\,P^*\,x_t$. Since P is acyclic, $x_t\,P\,x_s$ is false, and so 
$u(x_s, x_t) \leq u(x_s, x_s)$. 

Therefore, whenever $t(n') > t(n) + 1$, 

$$u(x_{t(n')}, x_{t(n)+1}) \leq u(x_{t(n')}, x_{t(n')}).$$

Taking limits as $n' \to \infty$, 

$$u(x^*, x_{t(n)+1}) \leq u(x^*, x^*).$$

Taking the limit as $n \to \infty$, 

$$u(x^*, x) \leq u(x^*, x^*). \tag{1}$$

Now, $x_{t(n)+1} \in C(x_{t(n)})(A)$. So, for any $y \in A$: 

$$u(x_{t(n)}, y) \leq u(x_{t(n)}, x_{t(n)+1}).$$

Taking the limit as $n \to \infty$: 

$$u(x^*, y) \leq u(x^*, x).$$

From (1), it follows that $u(x^*, y) \leq u(x^*, x^*)$. Therefore, if $y\,P\,x^*$, then 
$y \notin A$; so x* is a long-run choice, as required. 

Theorem 3. Suppose that 

(i) A is a compact subset of X; 

(ii) P is acyclic. 

Then x* is the unique long-run choice from A iff every conservative choice 
sequence from A converges to x*. 





Proof. (1) (a) If any conservative choice sequence from A, $(x_t)$, 
converges to x*, then x* is a cluster point of $(x_t)$, and so, by Theorem 2, x* is 
a long-run choice from A. 

(b) If x is a long-run choice from A, then the sequence $(x_t)$ with 
$x_t = x$ (each t) is a conservative choice sequence from A. So, if it converges 
to x*, x = x*; it follows that x* is the unique long-run choice from A. 

(2) Conversely suppose x* is the unique long-run choice from A. 
Let $(x_t)$ be any conservative choice sequence from A. By Theorem 2, 
any cluster point of $(x_t)$ is a long-run choice from A. So x* is the only 
possible cluster point of $(x_t)$. But, since A is compact, $(x_t)$ must have a 
cluster point. So x* is the unique cluster point of $(x_t)$, and so $x_t \to x^*$ as 
$t \to \infty$. 


3. Endogenous Consumption Sets 

In [8], Weizsacker had a fixed consumption set X. In [5] and [6], on the 
other hand, Pollak has effectively made the consumption set, as well as 
preferences, endogenous. The domain of definition of his short-run utility 
functions is a set X(x) which depends on x, consumption in the previous 
period. 3 

Let X, a subset of finite-dimensional Euclidean space, be the exogenous 
consumption set. For $x \in X$, let X(x) be the endogenous consumption set 
for period t, when consumption in period t − 1 was x. Assume that 
$X(x) \subseteq X$ for each $x \in X$. 

Despite having linear adjustment equations, Pollak was only able to 
claim local stability of long-run choice, because of the need to satisfy 
“nonnegativity and regularity conditions,” i.e., the need to stay within the 
endogenous consumption sets. 4 Here, two extra assumptions are made, 
which overcome this problem. 

Consider the set {x}, for any single option x e X. For long-run choice 
from {x} to be possible, one clearly needs to have x e X(x), otherwise, 
after x has been chosen once, if only {x} is still feasible, the choice set 
is empty. 

This explains the first extra assumption, which is: 

For all x e X, x e X(x). 

3 See, for example, p. 749 of [5]. 

4 See [5, pp. 156-151]. It is tempting to regard X as representing physiological needs, 
and X(x) as psychological needs (cf. [5, p. 749]). But some physiological needs are 
endogenous, e.g., rest after extra physical exertion. 





Given this assumption, and one other to be explained shortly, the 
problem of endogenous consumption sets can be handled by extending, 
for each $x \in X$, the domain of $u(x, \cdot)$ from X(x) to the whole of X. To do 
this, define: 

$$\tilde{u}(x, y) = \begin{cases} \exp[u(x, y)] & \text{if } y \in X(x), \\ \inf_{z \in X(x)} \{\exp[u(x, z)]\} & \text{if } y \notin X(x). \end{cases}$$

Even if $u(x, z) \to -\infty$ as z tends to a boundary point of X(x), this function 
$\tilde{u}(\cdot, \cdot)$ is well defined. 

The final assumption I shall make is that $\tilde{u}(\cdot, \cdot)$ is a continuous function 
on $X \times X$. This is true, for example, if there is a constant a (which may 
be $-\infty$) such that, as z tends to any boundary point of X(x), u(x, z) tends 
to a, and $u(x, z) \geq a$ for all $z \in X(x)$. 5 

Given this extra assumption, one can consider the function $\tilde{u}(\cdot, \cdot)$ 
defined on the whole of $X \times X$, and proceed just as in Sections 1 and 2, 
to relate long-run preferences, unique long-run choice and global stability. 
But these long-run preferences and long-run choices must be related to 
choice, etc., in the proper sense, when the constraints imposed by the 
endogenous consumption sets have been recognized. 

Define the relations $P$, $\tilde{P}$ and long-run choice functions $C$, $\tilde{C}$ as follows: 

$$x\,P\,y \iff u(y, x) > u(y, y) \ (\text{and } x \in X(y)),$$
$$x\,\tilde{P}\,y \iff \tilde{u}(y, x) > \tilde{u}(y, y),$$
$$C(A) = \{\, x \in A \mid x \in X(x) \text{ and } y\,P\,x \Rightarrow y \notin A \cap X(x) \,\},$$
$$\tilde{C}(A) = \{\, x \in A \mid y\,\tilde{P}\,x \Rightarrow y \notin A \,\}.$$

The necessary relationships are given by the following: 

Lemma. If, for all $x \in X$, $x \in X(x)$, then: 

(1) $P = \tilde{P}$; 

(2) for all $A \subseteq X$, $C(A) = \tilde{C}(A)$. 

Proof. (1) (a) If $x\,P\,y$, then $u(y, x) > u(y, y)$. Since $y \in X(y)$, 
$\tilde{u}(y, y) = \exp[u(y, y)]$. Since $x \in X(y)$, $\tilde{u}(y, x) = \exp[u(y, x)]$. Therefore 
$\tilde{u}(y, x) > \tilde{u}(y, y)$, i.e., $x\,\tilde{P}\,y$. 

(b) If $x\,\tilde{P}\,y$, then $\tilde{u}(y, x) > \tilde{u}(y, y)$. Since $y \in X(y)$, $\tilde{u}(y, y) =$ 

5 This is certainly satisfied by four out of five of Pollak’s short-run utility functions, 
namely those corresponding to (1.1), (1.2), (1.3) and (1.5) on p. 746 of [5], with dynamics 
given by (2.2) on p. 749. (1.4) is peculiar, because the consumption set is bounded 
above rather than below. 





$\exp[u(y, y)]$. Therefore $x \in X(y)$ (otherwise $\tilde{u}(y, x)$ would be the infimum 
over X(y), which cannot exceed $\tilde{u}(y, y)$), and so $\tilde{u}(y, x) = \exp[u(y, x)]$. Thus 
$u(y, x) > u(y, y)$ and so $x\,P\,y$. 

(2) $y\,P\,x \Rightarrow y \in X(x)$. Also, $x \in X(x)$, for all $x \in X$. Therefore 
$C(A) = \{\, x \in A \mid y\,P\,x \Rightarrow y \notin A \,\} = \tilde{C}(A)$ because $P = \tilde{P}$. 

Finally, note that it is always possible to construct a conservative 
sequence $(x_t)$ with $x_{t+1} \in C(x_t)(A)$ (each t), where 

$$C(x_t)(A) = \{\, y \in A \cap X(x_t) \mid z \in A \cap X(x_t) \Rightarrow u(x_t, y) \geq u(x_t, z) \,\}.$$

Moreover, if $z \in A$, then $\tilde{u}(x_t, x_{t+1}) \geq \tilde{u}(x_t, z)$, whether or not $z \in X(x_t)$, 
and so $x_{t+1} \in \tilde{C}(x_t)(A)$, where 

$$\tilde{C}(x_t)(A) = \{\, y \in A \mid z \in A \Rightarrow \tilde{u}(x_t, y) \geq \tilde{u}(x_t, z) \,\}.$$

Thus, any “proper” conservative sequence, for the choice functions 
$C(x)(\cdot)$, is also an “improper” conservative sequence, for the choice 
functions $\tilde{C}(x)(\cdot)$. 

It follows that Theorems 1, 2, 3 all remain true when the consumption 
set is endogenous, provided that 

(a) For all $x \in X$, $x \in X(x)$; 

(b) The extended utility function $\tilde{u}(\cdot, \cdot)$ is continuous on $X \times X$. 


4. Tastes Dependent on Weighted Averages of Past Consumption 

Pollak [5] and McCarthy [3] have also considered the case in which 
short-run utility does not depend just on consumption the previous period, 
but on geometrically weighted averages of consumption in all previous 
periods. This case can, however, be handled by reinterpreting the vector x. 
Let $c_t$ denote the consumption vector in period t. For each good i, define: 

$$x_{it} = (1 - \delta_i) \sum_{s=0}^{\infty} \delta_i^s\, c_{i,t-s} \qquad (0 < \delta_i < 1). \ ^6$$

Write δ for the diagonal matrix whose nonzero elements are $\delta_1, \delta_2, \delta_3, \dots$. 
Then 

$$x_t = (I - \delta) \sum_{s=0}^{\infty} \delta^s c_{t-s},$$


6 This is (2.7) of [5, p. 750], in revised notation, and with the “memory coefficients” 
$\delta_i$ varying between goods. See also [3, p. 256]. 





and so 

$$c_t = (I - \delta)^{-1}(x_t - \delta x_{t-1}).$$

So the utility function $u(x_{t-1}, c_t)$ can be rewritten as 

$$\tilde{u}(x_{t-1}, x_t) = u(x_{t-1}, (I - \delta)^{-1}(x_t - \delta x_{t-1})).$$

Now just work with the weighted averages $x_t$ rather than with consumption 
itself, $c_t$. All the results go through, although some of the definitions, 
conditions and assumptions have a different interpretation. For example, 
the long-run preference relation $\tilde{P}$ is now given by 

$$x \tilde{P} y \Leftrightarrow \tilde{u}(y, x) > \tilde{u}(y, y),$$

in contrast to the relation P defined in Section 1. Of course, if $x_t = x_{t-1}$, 
then $c_t = x_{t-1}$, and so long-run choice does correspond to choosing a 
stationary consumption sequence. Also, if $x_t \to x^*$, then $c_t \to x^* = c^*$, 
so stability for x implies stability for consumption. So the essentials, 
long-run preferences and stable choice, have their meaning preserved. 
In particular, Theorems 1, 2, and 3 all remain true, with the relation $\tilde{P}$ 
replacing the relation P. 
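
The change of variables in this section can be illustrated numerically. The following is a minimal sketch, not part of the original paper: it assumes the geometric average satisfies the one-step recursion x_t = (1 - delta) c_t + delta x_{t-1}, which follows from the infinite-sum definition above, and the function names and sample numbers are hypothetical.

```python
def update_average(x_prev, c, delta):
    """One step of x_t = (1 - delta) * c_t + delta * x_{t-1}, good by good."""
    return [(1 - d) * ci + d * xi for xi, ci, d in zip(x_prev, c, delta)]


def implied_consumption(x_prev, x, delta):
    """Invert the recursion: c_t = (x_t - delta * x_{t-1}) / (1 - delta)."""
    return [(xi - d * xp) / (1 - d) for xp, xi, d in zip(x_prev, x, delta)]


# Illustrative (made-up) memory coefficients and vectors for two goods.
delta = [0.5, 0.8]
x_prev = [1.0, 2.0]
c = [3.0, 1.0]

x = update_average(x_prev, c, delta)
c_back = implied_consumption(x_prev, x, delta)
print(x, c_back)  # c_back recovers c up to rounding

# A stationary average (x_t = x_{t-1}) implies stationary consumption c_t = x_{t-1}.
print(implied_consumption(x_prev, x_prev, delta))
```

Working with x rather than c therefore loses no information: each consumption vector can be recovered from two consecutive averages.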


5. Discussion of the Results 

Three stability theorems were proved in Section 2, and applied to 
developments of the original model in Sections 3 and 4. 

Theorem 1 shows that long-run preferences are acyclic if and only if any 
conservative choice sequence from a finite set converges. Naturally, 
the limit must be a long-run choice. Of course, while this is a stability test 
for the acyclicity of long-run preferences, it does not demonstrate the 
existence of a long-run utility function. In fact, it is easy to provide an 
example where long-run strict preferences are not only acyclic, but are 
transitive, and yet there is no long-run utility function, nor even a long-run 
preference ordering. 

Example 1. Let the short-run utility function be 

$$u(x, y) = \min_i \{y_i / x_i\}.$$

Then the long-run preference relation P satisfies 

$$y P x \Leftrightarrow y_i > x_i \quad (\text{each } i).$$

So P is transitive, but does not correspond to an ordering. 
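
Example 1 can be checked on a few points. The sketch below is illustrative only: it assumes strictly positive consumption vectors, takes y P x to mean u(x, y) > u(x, x), and uses made-up sample points. It shows three vectors for which P holds along a chain, and three for which "neither is preferred" fails to be transitive, which is why P cannot come from an ordering.

```python
def u(x, y):
    """Short-run utility of Example 1: min over goods of y_i / x_i."""
    return min(yi / xi for xi, yi in zip(x, y))


def prefers(y, x):
    """Long-run strict preference: y P x iff u(x, y) > u(x, x)."""
    return u(x, y) > u(x, x)


def incomparable(a, b):
    """Neither a P b nor b P a."""
    return not prefers(a, b) and not prefers(b, a)


# Transitivity along a chain of componentwise-larger bundles.
a, b, c = (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)
print(prefers(b, a), prefers(c, b), prefers(c, a))

# p and q are incomparable, q and r are incomparable, yet r P p.
p, q, r = (2.0, 2.0), (1.0, 5.0), (3.0, 3.0)
print(incomparable(p, q), incomparable(q, r), prefers(r, p))
```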





Theorem 3 shows that, provided long-run preferences are acyclic, a 
unique long-run choice from a compact set is always globally stable, in the 
sense that any conservative choice sequence converges to the unique 
long-run choice. 

The difference between Theorems 1 and 3 is that, whereas conservatism 
ensures convergence in a finite set, it does not ensure convergence in a 
compact set. In fact, if the long-run choice set contains multiple choices, 
it is possible for a conservative choice sequence not to converge to a 
single point, despite having acyclic long-run preferences—indeed, despite 
having a continuous long-run utility function. This can be seen from 

Example 2. Let X be the closed interval [0, 6] of the real line. Define 
the value of the short-run utility function $u(x, y)$, on the set X, as 

(a) (i) $(2 - x)(y - x)$ $\quad (0 \le x \le 2,\ 0 \le y \le 5 - \tfrac{1}{2}x)$; 
    (ii) $(10 - 3x)(6 - x - y)$ $\quad (0 \le x \le 2,\ 5 - \tfrac{1}{2}x \le y \le 6)$; 

(b) (i) $0$ $\quad (2 \le x \le 3,\ 0 \le y \le 4)$; 
    (ii) $4(3 - x)(4 - y)$ $\quad (2 \le x \le 3,\ 4 \le y \le 6)$; 

(c) (i) $4(x - 3)(y - 2)$ $\quad (3 \le x \le 4,\ 0 \le y \le 2)$; 
    (ii) $0$ $\quad (3 \le x \le 4,\ 2 \le y \le 6)$; 

(d) (i) $(3x - 8)(x + y - 6)$ $\quad (4 \le x \le 6,\ 0 \le y \le 4 - \tfrac{1}{2}x)$; 
    (ii) $(x - 4)(x - y)$ $\quad (4 \le x \le 6,\ 4 - \tfrac{1}{2}x \le y \le 6)$. 

Note that for all $(x, y) \in X \times X$, $u(6 - x, 6 - y) = u(x, y)$. Then it is 
easy to verify that $u(x, y)$ is continuous. 

Let S denote the subinterval [2, 4]. Then, for all $x \in X$: $u(x, x) = 0$ 
and $u(x, 6 - x) = 0$. Also, $u(x, y) > 0$ if the distance $d(y, S)$ of the point 
y from the set S is less than the distance $d(x, S)$ of the point x from the 
set S. Hence there is a continuous long-run utility function: 

$$v(x) = -d(x, S).$$

But the following sequence $(x_t)$, starting from $x_0 = 0$, is a conservative 
choice sequence for the compact set X, which does not converge: 

$$x_t = \begin{cases} 4 + 2^{1-t} & (t \text{ odd}) \\ 2 - 2^{1-t} & (t \text{ even}). \end{cases}$$

Note, however, that the cluster points 2 and 4 are in the long-run choice 
set, as promised by Theorem 2. 
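
The oscillation in Example 2 can be reproduced directly. The sketch below is an illustration under stated assumptions, not the author's computation: it encodes the piecewise utility with the half-slope boundaries y = 5 - x/2 and y = 4 - x/2 (the values that make the pieces join continuously), uses the symmetry u(x, y) = u(6 - x, 6 - y) to cover cases (c) and (d), and checks only that each term of the stated sequence maximizes short-run utility given the previous term. The grid search helper is hypothetical.

```python
def u(x, y):
    """Short-run utility of Example 2 on X = [0, 6] (reconstructed piecewise form)."""
    if x > 3:                      # cases (c), (d) via u(x, y) = u(6 - x, 6 - y)
        x, y = 6 - x, 6 - y
    if x <= 2:                     # case (a)
        if y <= 5 - x / 2:
            return (2 - x) * (y - x)
        return (10 - 3 * x) * (6 - x - y)
    if y <= 4:                     # case (b)(i), 2 <= x <= 3
        return 0.0
    return 4 * (3 - x) * (4 - y)   # case (b)(ii)


def x_seq(t):
    """Closed form of the non-convergent sequence, with x_0 = 0."""
    return 4 + 2.0 ** (1 - t) if t % 2 else 2 - 2.0 ** (1 - t)


def best_response(x, steps=384):
    """Maximizer of u(x, .) over an evenly spaced grid on [0, 6] (spacing 1/64)."""
    grid = [6 * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda y: u(x, y))


path = [x_seq(t) for t in range(7)]
print(path)
print([best_response(x) for x in path[:-1]])  # should reproduce path[1:]
```

The terms alternate between neighborhoods of 4 and 2, approaching both, so the sequence has two cluster points and no limit.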






So Theorem 2 is the strongest general result we can expect when 
multiple long-run choices occur. It says that any cluster point of a 
conservative choice sequence is a long-run choice. Oscillations between 
the neighborhoods of multiple long-run choices are still possible. 


6. Application to Long-Run Demand Theory 

So far, we have been considering general choice situations rather than 
a consumer choosing from a budget set. Thus, in our analysis 

(i) The feasible set A need not be a budget set; it just needs to be 
compact. For example, it can be a production set. And indivisibilities and 
nonconvexities are allowed in the consumption set X, too. 

(ii) There is no presumption that the long-run preference relation is 
an ordering, or that it corresponds to any long-run utility function. 
The point of this remark is illustrated by Example 1 of Section 5. 

Nevertheless, Theorem 3 has immediate consequences for the theory of 
long-run demand. 

Given the long-run consumption set X, and the short-run consumption 
sets $X(x)$ ($x \in X$), define: 

(a) for each strictly positive price vector p, and income m, the 
long-run budget set: 

$$B(p, m) = \{x \in X \mid px \le m\}$$

and, for each $x \in X$, the short-run budget set: 

$$B(x; p, m) = \{y \in X(x) \mid py \le m\};$$

(b) the set of possible budgets in the long run: 

$$Z = \{(p, m) \mid p > 0 \text{ and } \exists x \in B(p, m)\}$$

and, for each $x \in X$, the set of possible budgets in the short run: 

$$Z(x) = \{(p, m) \mid p > 0 \text{ and } \exists y \in B(x; p, m)\}.$$

Theorem 4. Suppose that: 

(i) For each $x \in X$ and each $(p, m) \in Z(x)$, there is a single demand 
vector $y = h(x; p, m)$ which maximizes the short-run utility function $u(x, y)$ 
over the short-run budget set $B(x; p, m)$; 





(ii) For each $x \in X$, $x \in X(x)$ and the function $\tilde{u}(x, y)$ defined in 
Section 3 is continuous; 

(iii) For each $(p, m) \in Z$, the long-run budget set $B(p, m)$ is compact, 
and there is a unique long-run demand vector $x = h(p, m)$ such that 
$x = h(x; p, m)$; 

(iv) There is a long-run utility function $v(x)$ defined on X such that, 
for each $(p, m) \in Z$, the long-run demand vector $x = h(p, m)$ maximizes 
$v(x)$ over the long-run budget set $B(p, m)$. 

Then for any $(p, m) \in Z$, the unique long-run demand $x = h(p, m)$ is 
globally stable, in the sense that, if $x_t$ is a demand sequence, with 

$$x_t = h(x_{t-1}; p, m) \qquad (t = 1, 2, \ldots),$$

then $x_t \to x$ as $t \to \infty$. 

Proof. The theorem is an obvious consequence of Theorem 3 in 
Section 2. The only difficulty arises from the endogeneity of the 
consumption set $X(x)$, but this can be accommodated by considering the 
short-run utility function $\tilde{u}(x, y)$ defined on the whole of X, as in Section 3. 

Theorem 4 establishes that long-run demands are globally stable in 
four out of the five cases Pollak considers in [5].⁷ In these cases, all the 
conditions of Theorem 4 are satisfied. And, of course, they are satisfied 
in more general situations besides. Also, global stability is still true when 
tastes depend on weighted averages of past consumption, because of the 
remarks in Section 4. 

One might hope to derive more powerful results within the special 
framework of long-run demand theory, rather than the general framework 
of long-run choice theory. This is left for later work.⁸ 


7 The exceptional fifth case is (1.4) on p. 746 of [5], for which the consumption set is 
bounded above and unbounded below. 

8 One difficulty in extending von Weizsäcker's results in [8] to more than two goods 
is illustrated by an example which was shown to me by James Mirrlees. The 
consumption set X is the positive orthant of $E^3$, and the short-run utility function is: 

$$u(x, y) = \tfrac{1}{2} \log y_1 + [(\log y_2 + x_1 \log y_3)/2(1 + x_1)].$$

There is a well-defined single valued long-run demand function, which, however, 
violates the Slutsky symmetry conditions, so it does not maximize a long-run preference 
ordering. Yet the single valued short-run demands converge to the long-run demand 
in a single period. 





References 

1. R. H. Day and P. E. Kennedy, Recursive decision systems: an existence analysis, 
Econometrica 38 (1970), 661-681. 

2. W. M. Gorman, Tastes, habits and choices, Int. Econ. Rev. 8 (1967), 218-222. 

3. M. D. McCarthy, On the stability of dynamic demand systems, Int. Econ. Rev. 15 
(1974), 256-259. 

4. A. Mas-Colell, An equilibrium existence theorem without complete or transitive 
preferences, J. Math. Econ. 1 (1974), 237-246. 

5. R. A. Pollak, Habit formation and dynamic demand functions, J. Polit. Econ. 78 
(4, part 1) (1970), 745-764. 

6. R. A. Pollak, Habit formation and long-run utility functions, J. Econ. Theory 
13 (1976), 272-297. 

7. H. F. Sonnenschein, Demand theory without transitive preferences, with 
applications to the theory of competitive equilibrium, in “Preferences, Utility and Demand” 
(J. S. Chipman, L. Hurwicz, M. K. Richter and H. F. Sonnenschein, Eds.), Chap. 10. 
Harcourt Brace Jovanovich, New York, 1971. 

8. C. C. von Weizsäcker, Notes on endogenous change of tastes, J. Econ. Theory 3 
(1971), 345-372. 





JOURNAL OF ECONOMIC THEORY 

Volume 13, Number 3, December 1976 




Published bimonthly at 37 Tempelhof, Bruges, Belgium 
by Academic Press, Inc., 111 Fifth Avenue, New York, N.Y. 10003 
1976: Volumes 12-13. Price per volume: $39.50 U.S.A.; 

$42.50 outside U.S.A. (plus postage). 

1977: Volumes 14-16. Price per volume: $44.00 U.S.A.; 

$47.00 outside U.S.A. (plus postage). 

Information concerning personal subscription rates may be obtained 
by writing to the Publisher. 

All correspondence and subscription orders should be addressed to the office of the 
Publishers at 111 Fifth Avenue, New York, N.Y. 10003. 

Send notices of change of address to the office of the Publishers at least 
6-8 weeks in advance. Please include both old and new addresses. 

Second class postage paid at Jamaica, N.Y. 

Air freight and mailing in the U.S.A. by Publications Expediting, Inc., 

200 Meacham Avenue, Elmont, New York 11003. 


Printed in Bruges, Belgium, by the St. Catherine Press, Ltd. 





JOURNAL OF ECONOMIC THEORY 13, 341-360 (1976) 


The Arbitrage Theory of Capital Asset Pricing 

Stephen A. Ross* 

Departments of Economics and Finance, University of Pennsylvania, 
The Wharton School, Philadelphia, Pennsylvania 19174 

Received March 19, 1973; revised May 19, 1976 


The purpose of this paper is to examine rigorously the arbitrage model 
of capital asset pricing developed in Ross [13, 14]. The arbitrage model 
was proposed as an alternative to the mean variance capital asset pricing 
model, introduced by Sharpe, Lintner, and Treynor, that has become the 
major analytic tool for explaining phenomena observed in capital markets 
for risky assets. The principal relation that emerges from the mean 
variance model holds that for any asset, i, its (ex ante) expected return 

$$E_i = \rho + \lambda b_i, \qquad (1)$$

where $\rho$ is the riskless rate of interest, $\lambda$ is the expected excess return 
on the market, $E_m - \rho$, and 

$$b_i \equiv \sigma_{im}^2 / \sigma_m^2$$

is the beta coefficient on the market, where $\sigma_m^2$ is the variance of the 
market portfolio and $\sigma_{im}^2$ is the covariance between the returns on the ith 
asset and the market portfolio. (If a riskless asset does not exist, $\rho$ is the 
zero-beta return, i.e., the return on all portfolios uncorrelated with the 
market portfolio.)¹ 

The linear relation in (1) arises from the mean variance efficiency of the 
market portfolio, but on theoretical grounds it is difficult to justify 
either the assumption of normality in returns (or local normality in 
Wiener diffusion models) or of quadratic preferences to guarantee such 
efficiency, and on empirical grounds the conclusions as well as the 


* Professor of Economics, University of Pennsylvania. This work was supported 
by a grant from the Rodney L. White Center for Financial Research at the University 
of Pennsylvania and by National Science Foundation Grant GS-35780. 

1 See Black [2] for an analysis of the mean variance model in the absence of a riskless 
asset. 



assumptions of the theory have also come under attack.² The restrictiveness 
of the assumptions that underlie the mean variance model has, however, 
long been recognized, but its tractability and the evident appeal of the 
linear relation between return, $E_i$, and risk, $b_i$, embodied in (1) have 
ensured its popularity. An alternative theory of the pricing of risky assets 
that retains many of the intuitive results of the original theory was 
developed in Ross [13, 14]. 

In its barest essentials the argument presented there is as follows. 
Suppose that the random returns on a subset of assets can be expressed 
by a simple factor model 

$$\tilde{x}_i = E_i + \beta_i \tilde{\delta} + \tilde{\epsilon}_i, \qquad (2)$$

where $\tilde{\delta}$ is a mean zero common factor, and $\tilde{\epsilon}_i$ is mean zero with the 
vector $\langle \tilde{\epsilon} \rangle$ sufficiently independent to permit the law of large numbers to 
hold. Neglecting the noise term, $\tilde{\epsilon}_i$, as discussed in Ross [14], (2) is a 
statement that the state space tableau of asset returns lies in a 
two-dimensional space that can be spanned by a vector with elements $\delta_\theta$ 
(where $\theta$ denotes the state of the world) and the constant vector, 
$e \equiv \langle 1, \ldots, 1 \rangle$. 

Step 1. Form an arbitrage portfolio, $\eta$, of all the n assets, i.e., a 
portfolio which uses no wealth, $\eta e = 0$. We will also require $\eta$ to be a 
well-diversified portfolio with each component, $\eta_i$, of order $1/n$ in 
(absolute) magnitude. 

Step 2. By the law of large numbers, for large n the return on the 
arbitrage portfolio 

$$\eta \tilde{x} = \eta E + (\eta \beta)\tilde{\delta} + \eta \tilde{\epsilon} \approx \eta E + (\eta \beta)\tilde{\delta}. \qquad (3)$$

In other words the influence on the well-diversified portfolio of the 
independent noise terms becomes negligible. 

Step 3. If we now also require that the arbitrage portfolio, $\eta$, be chosen 
so as to have no systematic risk, then 

$$\eta \beta = 0,$$

and from (3) 

$$\eta \tilde{x} \approx \eta E.$$

2 See Blume and Friend [3] for a recent example of some of the empirical difficulties 
faced by the mean variance model. For a good review of the theoretical and empirical 
literature on the mean variance model see Jensen [6]. 





Step 4. Using no wealth, the random return $\eta \tilde{x}$ has now been 
engineered to be equivalent to a certain return, $\eta E$, hence to prevent arbitrarily 
large disequilibrium positions we must have $\eta E = 0$. Since this restriction 
must hold for all $\eta$ such that $\eta e = \eta \beta = 0$, E is spanned by e and $\beta$, or 

$$E_i = \rho + \lambda \beta_i, \qquad (4)$$

for constants $\rho$ and $\lambda$. Clearly if there is a riskless asset, $\rho$ must be its 
rate of return. Even if there is not such an asset, $\rho$ is the rate of return on 
all zero-beta portfolios, $\alpha$, i.e., all portfolios with $\alpha e = 1$ and $\alpha \beta = 0$. 
If $\alpha$ is a particular portfolio of interest, e.g., the market portfolio, $\alpha_m$, 
with $E_m = \alpha_m E$, (4) becomes 

$$E_i = \rho + (E_m - \rho)\beta_i. \qquad (5)$$

Condition (5) is the arbitrage theory equivalent of (1), and if $\tilde{\delta}$ is a 
market factor return then $\beta_i$ will approximate $b_i$. The above approach, 
however, is substantially different from the usual mean-variance analysis 
and constitutes a related but quite distinct theory. For one thing, the 
argument suggests that (5) holds not only in equilibrium situations, but 
in all but the most profound sort of disequilibria. For another, the market 
portfolio plays no special role. 
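
The four steps of the heuristic argument can be sketched numerically. The code below is an illustration only, not part of the original paper: it assumes a single factor, a common noise variance, uncorrelated noise, and made-up loadings and parameters. It builds a zero-wealth, zero-beta portfolio with weights of order 1/n by projection (one convenient construction among many), and reports the portfolio's expected return when E satisfies the linear relation exactly, together with its noise variance.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def remove_projections(z, basis):
    """Subtract from z its projection on each vector of an orthogonal basis."""
    out = list(z)
    for b in basis:
        coef = dot(out, b) / dot(b, b)
        out = [o - coef * bi for o, bi in zip(out, b)]
    return out


def arbitrage_portfolio(beta):
    """A portfolio eta with eta.e = 0 and eta.beta = 0 and weights of order 1/n."""
    n = len(beta)
    e = [1.0] * n
    beta_perp = remove_projections(beta, [e])             # beta made orthogonal to e
    z = [(-1) ** i * (1 + i % 3) / n for i in range(n)]   # arbitrary starting weights
    return remove_projections(z, [e, beta_perp])


rho, lam, sigma2 = 0.03, 0.05, 0.04   # hypothetical riskless rate, premium, noise variance
for n in (30, 300, 3000):
    beta = [0.5 + (i % 5) / 4 for i in range(n)]   # made-up factor loadings
    E = [rho + lam * b for b in beta]              # expected returns on the pricing line
    eta = arbitrage_portfolio(beta)
    # Expected return is zero up to rounding; noise variance is sigma2 * sum(eta_i^2).
    print(n, dot(eta, E), sigma2 * dot(eta, eta))
```

With weights of order 1/n the noise variance is of order 1/n, which is the sense in which Step 2 treats the noise term as negligible for large n.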

There are, however, some weak points in the heuristic argument. For 
example, as the number of assets, n, is increased, wealth will, in general, 
also increase. Increasing wealth, though, may increase the risk aversion of 
some economic agents. The law of large numbers implies, in Step 2, that 
the noise term, $\eta \tilde{\epsilon}$, becomes negligible for large n, but if the degree of risk 
aversion is increasing with n these two effects may cancel out and the 
presence of noise may persist as an influence on the pricing relation. 
In Section I we will present an example of a market where this occurs. 
Furthermore, even if the noise term can be eliminated, it is not at all 
obvious that (5) must hold, since the disequilibrium position of one agent 
might be offset by the disequilibrium position of another.³ 

In Ross [13], however, it was shown that if (5) holds then it represents 
an $\epsilon$ or quasi-equilibrium. The intent of this paper is to supply the rigorous 
analysis underlying the stronger stability arguments above. In Section II 
we will present some weak sufficient conditions to rule out the above 
exceptions (and the example of Section I) and we will prove a general 
version of the arbitrage result. Section II also includes a brief argument 
on the empirical practicality of the results. A mathematical appendix 

3 Green has considered this point in a temporary equilibrium model. Essentially 
he argues that if subjective anticipations differ too much, then arbitrage possibilities 
will threaten the existence of equilibrium. 





contains some supportive results of a somewhat technical and tangential 
nature. Section III will briefly summarize the paper and suggest further 
generalizations. 


I. A Counterexample 

In this section we will present an example of a market where the 
sequence of equilibrium pricing relations does not approach the one 
predicted by the arbitrage theory as the number of assets is increased. 
The counterexample is valuable because it makes clear what sort of 
additional assumptions must be imposed to validate the theory. 

Suppose that there is a riskless asset and that risky assets are 
independently and normally distributed as 

$$\tilde{x}_i = E_i + \tilde{\epsilon}_i, \qquad (6)$$

where 

$$E\{\tilde{\epsilon}_i\} = 0,$$

and 

$$E\{\tilde{\epsilon}_i^2\} = \sigma^2.$$

The arbitrage argument would imply that in equilibrium all of the 
independent risk would disappear and, therefore, 

$$E_i \approx \rho. \qquad (7)$$

Assume, however, that the market consists of a single agent with a 
von Neumann-Morgenstern utility function of the constant absolute risk 
aversion form, 

$$U(z) = -\exp(-Az). \qquad (8)$$

Letting w denote wealth with the riskless asset as the numeraire, and $\alpha$ the 
portfolio of risky assets (i.e., $\alpha_i$ is the proportion of wealth placed in the 
ith risky asset) and taking expectations we have 

$$E\{U[w(\rho + \alpha[\tilde{x} - \rho \cdot e])]\} = -\exp(-Aw\rho)\, E\{\exp(-Aw\alpha[\tilde{x} - \rho \cdot e])\}$$

$$= -\exp(-Aw\rho)\{\exp(-Aw\alpha[E - \rho \cdot e] + (\sigma^2/2)(Aw)^2(\alpha'\alpha))\}. \qquad (9)$$

The first-order conditions at a maximum are given by 

$$\sigma^2 (Aw)\alpha_i = E_i - \rho. \qquad (10)$$





If the riskless asset is in unit supply the budget constraint (Walras' Law 
for the market) becomes 

$$w = \sum_{i=1}^{n} \alpha_i w + 1 = (1/A\sigma^2) \sum_{i=1}^{n} (E_i - \rho) + 1. \qquad (11)$$

The interpretation of the budget constraint (11) depends on the 
particular market situation we are describing. Suppose, first, that we are 
adding assets which will pay a random total numeraire amount, $\tilde{c}_i$. 
If $p_i$ is the current numeraire price of the asset then 

$$\tilde{x}_i = \tilde{c}_i / p_i.$$

Normalizing all risky assets to be in unit supply we must have 

$$p_i = \alpha_i w,$$

and the budget constraint simply asserts that wealth is summed value, 

$$w = \sum_{i=1}^{n} p_i + 1.$$

If we let $\bar{c}_i$ denote the mean of $\tilde{c}_i$ and $\hat{c}_i^2$ its variance, then (10) can be 
solved for $p_i$ as 

$$p_i = (1/\rho)\{\bar{c}_i - A\hat{c}_i^2\}.$$

As a consequence, the expected returns, 

$$E_i = \bar{c}_i / p_i = \rho\{\bar{c}_i / (\bar{c}_i - A\hat{c}_i^2)\},$$

will be unaffected by changes in the number of assets, n, for $i < n$, and 
need bear no systematic relation to $\rho$ as n increases. This is a violation 
of the arbitrage condition, (7). Notice, too, that as long as $\bar{c}_i$ is bounded 
above $A\hat{c}_i^2$, wealth and relative risk aversion, $Aw$, are unbounded in n. 
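
A short computation makes the counterexample concrete. The sketch below is illustrative and not from the paper: it assumes identical assets with a hypothetical payoff mean and variance, treats the riskless rate as a gross return, and uses made-up parameter values. It evaluates the pricing formula implied by the first-order condition (10) under unit supply, price = (mean - A * variance) / rho, and shows that the expected return stays away from the riskless rate while total wealth grows with the number of assets.

```python
def price(c_mean, c_var, A, rho):
    """Equilibrium price under CARA utility: (mean payoff - A * payoff variance) / rho."""
    return (c_mean - A * c_var) / rho


def expected_return(c_mean, c_var, A, rho):
    """Expected gross return: mean payoff divided by price."""
    return c_mean / price(c_mean, c_var, A, rho)


def wealth(n, c_mean, c_var, A, rho):
    """Total wealth: n identical risky assets plus one unit of the riskless asset."""
    return n * price(c_mean, c_var, A, rho) + 1


# Hypothetical parameters: payoff mean 1, payoff variance 0.5, A = 1, gross riskless rate 1.05.
c_mean, c_var, A, rho = 1.0, 0.5, 1.0, 1.05
for n in (10, 100, 1000):
    E_i = expected_return(c_mean, c_var, A, rho)
    w = wealth(n, c_mean, c_var, A, rho)
    # E_i does not depend on n; relative risk aversion A * w grows without bound.
    print(n, E_i, w, A * w)
```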

An alternative interpretation of the market situation would be that as n 
increases the number of risky investment opportunities or activities is 
being increased, but not the number of assets. In this case wealth, w, would 
simply be the number of units of the riskless asset held and would remain 
constant as n increased. The quantities $\alpha_i w$ now represent the amount of 
the riskless holdings put into the ith investment opportunity and for the 
market as a whole we must have 

$$\sum_{i=1}^{n} \alpha_i \le 1.$$





Furthermore, if the random technological activities are irreversible, 
then each $\alpha_i \ge 0$. From (10) it follows that 

$$E_i - \rho \ge 0$$

and 

$$\sum_{i=1}^{n} (E_i - \rho) = \sum_{i=1}^{n} |E_i - \rho| = \sigma^2 (Aw) \sum_{i=1}^{n} \alpha_i \le \sigma^2 Aw.$$

Hence, as $n \to \infty$, the vector E approaches the constant vector with 
entries $\rho$ in absolute sum (the $\ell_1$ norm) which is a very strong type of 
approximation. Under this second interpretation, then, the arbitrage 
condition (7) holds. 

An easy way to understand the distinction between these two 
interpretations is to conceive of the riskless asset as silver dollars, and the 
risky assets as slot machines. In the first interpretation the slot machines 
come with a silver dollar in the slot and $p_i$ is the relative price of the ith 
“primed” machine in terms of silver dollars. In the alternative 
interpretation, the machines are “unprimed” and we invest silver dollars 
in the ith machine. Which of these two senses of a market being “large” 
is empirically more relevant is a debatable issue, and in the next section 
we will develop assumptions sufficient to verify the arbitrage result for 
both cases (and any intermediate ones as well). 


II. The Arbitrage Theory 

The difficulty with the constant absolute risk aversion example arises 
because the coefficient of relative risk aversion increases with wealth. 
This suggests considering risk averse agents for whom the coefficient of 
relative risk aversion is uniformly bounded, 

$$\sup_x \{-(U''(x)x / U'(x))\} \le R < \infty. \qquad (12)$$

We will refer to such agents as being of Type B (for bounded). 

Pratt has shown that given a Type B utility function, U, there exists a 
monotone increasing convex function, $G(\cdot)$, such that 

$$U(x) = G[U(x; R)], \qquad (13)$$

where $U(x; R)$ is the utility function with constant relative risk aversion, R. 
It is well known that 

$$U(x; R) = \begin{cases} \dfrac{x^{1-R}}{1-R} & \text{if } R \ne 1, \\ \log x & \text{if } R = 1. \end{cases} \qquad (14)$$





Essentially, then, Type B agents are uniformly less risk averse than some 
constant relative risk averse agents. 
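
The Type B condition can be illustrated with a finite-difference estimate of relative risk aversion. The sketch below is an assumption-laden illustration, not part of the paper: it uses the standard constant relative risk aversion form x^(1-R)/(1-R) (log x when R = 1), a hypothetical step size, and a CARA function with A = 1 for contrast.

```python
import math


def crra(x, R):
    """Constant relative risk aversion utility: x**(1-R)/(1-R), or log x when R = 1."""
    return math.log(x) if R == 1 else x ** (1 - R) / (1 - R)


def cara(z, A=1.0):
    """Constant absolute risk aversion utility, as in the counterexample of Section I."""
    return -math.exp(-A * z)


def relative_risk_aversion(U, x, h=1e-4):
    """Numerical estimate of -U''(x) * x / U'(x) by central differences."""
    u1 = (U(x + h) - U(x - h)) / (2 * h)
    u2 = (U(x + h) - 2 * U(x) + U(x - h)) / h ** 2
    return -u2 * x / u1


# CRRA: the coefficient is R at every wealth level, so the agent is Type B.
print([relative_risk_aversion(lambda x: crra(x, 2.0), x) for x in (0.5, 1.5, 5.0)])

# CARA: the coefficient is A * x, which is unbounded in wealth, so the agent is not Type B.
print([relative_risk_aversion(cara, x) for x in (0.5, 1.5, 5.0)])
```

For R different from 1 the function also satisfies U(wx; R) = w^(1-R) U(x; R), which is the property that lets wealth be factored out of expected utility.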

Assume that the returns on the particular subset of assets under 
consideration are subjectively viewed by agents in the market as being 
generated by a model of the form 

$$\tilde{x}_i = E_i + \beta_{i1}\tilde{\delta}_1 + \cdots + \beta_{ik}\tilde{\delta}_k + \tilde{\epsilon}_i \equiv E_i + \beta_i\tilde{\delta} + \tilde{\epsilon}_i, \qquad (15)$$

where 

$$E\{\tilde{\delta}_l\} = E\{\tilde{\epsilon}_i\} = 0,$$

and where the $\tilde{\epsilon}_i$'s are mutually stochastically uncorrelated. We will 
impose no further restrictions on the form of the multivariate distribution 
of $(\tilde{\delta}, \tilde{\epsilon})$ beyond the requirement that $(\exists \bar{\sigma} < \infty)$ 

$$\sigma_i^2 \equiv E\{\tilde{\epsilon}_i^2\} \le \bar{\sigma}^2. \qquad (16)$$

In particular, then, the $\tilde{\delta}_l$ need not be jointly independent or even 
independent of the $\tilde{\epsilon}_i$'s, they need not possess variances, and none of the 
random variables need be normally distributed. 

A point on notation is also needed. In what follows, $\alpha^0$ will denote an 
n-element optimal portfolio for the agent under consideration, i.e., 
$\alpha^0$ maximizes $E\{U[w\alpha\tilde{x}]\}$, subject to $\alpha e = 1$. The vector $\beta^l$ will be the 
column vector $\langle \beta_{1l}, \ldots, \beta_{nl} \rangle'$ and $\beta_i$, as above, denotes the row vector 
$\langle \beta_{i1}, \ldots, \beta_{ik} \rangle$. The single letter $\beta$ will denote the matrix 

$$[\beta^1 \,\vdots\, \cdots \,\vdots\, \beta^k].$$

Assumption 1 (Liability limitations). There exists at least one asset 
with limited liability in the sense that there is some bound, t, (per unit 
invested) to the losses for which an agent is liable. 

Assumption 1 is satisfied in the real world by a wide variety of assets. 
We can now prove a key result about Type B agents. 

Theorem I. Consider a Type B agent who lives in a world that satisfies 
Assumption 1 and who believes that returns are generated by a model of the 
form of (15). If $(\exists m < \infty)$ such that 

$$\alpha^0 E \le m, \qquad (17)$$

then $(\exists \rho$ and a k-vector, $\gamma)$ such that 

$$\sum_{i=1}^{\infty} (E_i - \rho - \beta_i\gamma)^2 < \infty. \qquad (18)$$




Proof. The result is independent of the particular wealth sequence $\langle w^n \rangle$ 
and we must prove it for arbitrary sequences. Assume that $R \ne 1$. We 
will prove the theorem by constructing a portfolio that bests $\alpha^0$ when (18) 
does not hold. First, from (17), concavity and monotonicity 

$$E\{U[w^n \alpha^0 \tilde{x}]\} \le U[w^n m] = G[(w^n)^{1-R}\, U(m; R)].$$

Now, consider the arbitrage portfolio sequence that solves the associated 
quadratic problem of minimizing unsystematic ($\tilde{\epsilon}$) risk subject to the 
constraints of having no systematic ($\beta$) risk and attaining an expected 
return greater than $m + t$: minimize 

$$\eta' V \eta,$$

subject to 

$$\eta' e = 0, \qquad \eta' \beta^l = 0; \quad l = 1, \ldots, k, \qquad (19)$$

and 

$$\eta' E = c > m + t,$$

where V is the covariance matrix of $\langle \tilde{\epsilon} \rangle$ and where t is the maximum 
liability loss associated with a unit investment in the limited liability asset. 
Assumption 1 guarantees that t is bounded. We will also assume, without 
loss of generality, that V is of full rank for all n.⁴ 

If the constraints are unsolvable for all n, then E must be linearly 
dependent on e and the columns of $\beta$ and we are done. Suppose then, 
that the constraints are solvable for all n sufficiently large and, without 
loss of generality, let 

$$X \equiv [E \,\vdots\, \beta \,\vdots\, e]$$

be of full rank.⁵ 

4 Since the $\tilde{\epsilon}_i$ are uncorrelated, V is a diagonal matrix and will be of less than full 
rank only if some asset has no noise term. If there are two or more such assets the 
arbitrage argument holds exactly and we can eliminate such assets without loss of 
generality. 

5 If $[\beta]$ is not of full rank then we can simply eliminate dependent factors. If $[\beta]$ is of 
full rank, but $[\beta \,\vdots\, e]$ is not, then all assets will have a common factor $\tilde{f}$ and we can 
write (15) as 

$$\tilde{x}_i = E_i + \tilde{f} + \beta_i\tilde{\delta} + \tilde{\epsilon}_i.$$

Now the proof of Theorem I is essentially unaltered, with the common factor, $\tilde{f}$, 
retained in all portfolios. 





We will assume that if a sequence of random variables converges to a 
degenerate law (a constant) in quadratic mean, then the expected utility 
also converges, and defer a rigorous examination of this point to an 
appendix. It follows that there must not be any subsequence on which 

$$\eta' V \eta \to 0.$$

If such a subsequence existed then 

$$E\{U(\eta\tilde{x} - t; R)\} \to U(c - t; R) > U(m; R),$$

and by the convexity of $G(\cdot)$ there would exist an n such that putting all 
wealth in the limited liability asset and buying the arbitrage portfolio 
would yield 

$$E\{U[w^n(\eta\tilde{x} - t)]\} = E\{G[(w^n)^{1-R}\, U((\eta\tilde{x} - t); R)]\}$$

$$\ge G[(w^n)^{1-R}\, E\{U((\eta\tilde{x} - t); R)\}]$$

$$> G[(w^n)^{1-R}\, U(m; R)],$$

violating optimality. Hence $(\exists a > 0)$ such that $(\forall n)$ 

$$\eta' V \eta \ge a > 0.$$

Solving (19) we have 

$$V\eta = X\lambda,$$

where $\lambda$ is a $(k + 2)$-vector of multipliers, and applying the constraints of 
(19) yields 

$$[X'V^{-1}X]\lambda = \begin{bmatrix} c \\ 0 \end{bmatrix}.$$

It now follows that 

$$\eta' V \eta = \lambda' \begin{bmatrix} c \\ 0 \end{bmatrix} \ge a > 0.$$
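
The quadratic problem (19) has a closed-form solution that can be checked numerically. The sketch below is illustrative only and rests on assumptions not in the paper: a single factor (k = 1), a diagonal covariance matrix V, made-up data, and a small hand-written Gaussian elimination routine in place of a linear algebra library. It solves [X'V^{-1}X] lambda = (c, 0, 0)', recovers eta from V eta = X lambda, and compares the minimized variance with c times the first multiplier.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (aug[r][n] - sum(aug[r][k] * z[k] for k in range(r + 1, n))) / aug[r][r]
    return z


def min_variance_arbitrage(E, beta, var, c):
    """Minimize sum(var_i * eta_i**2) subject to eta.E = c, eta.beta = 0, eta.e = 0."""
    n = len(E)
    cols = [E, beta, [1.0] * n]   # the columns of X = [E : beta : e]
    M = [[sum(a * b / v for a, b, v in zip(ci, cj, var)) for cj in cols] for ci in cols]
    lam = solve(M, [c, 0.0, 0.0])                                  # multipliers
    eta = [sum(l * col[i] for l, col in zip(lam, cols)) / var[i] for i in range(n)]
    return eta, lam


# Made-up data for five assets; E is deliberately not linear in beta.
E = [1.10, 1.02, 1.15, 1.07, 1.30]
beta = [1.0, 0.5, 1.2, 0.8, 1.5]
var = [0.04, 0.02, 0.05, 0.03, 0.06]
c = 0.1

eta, lam = min_variance_arbitrage(E, beta, var, c)
variance = sum(v * x * x for v, x in zip(var, eta))
print(dot(eta, E), dot(eta, beta), sum(eta))   # approximately c, 0, 0
print(variance, c * lam[0])                    # the two values agree up to rounding
```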

Defining $b = (c, 0)$ we can apply Lemma I in the Appendix to obtain the 
existence of $a^*$ and $A < \infty$ such that for all n 

$$(Xa^*)'(Xa^*) \le A < \infty, \qquad (20)$$

where 

$$a^*b = c\,a_1^* = 1$$



or 

$$a_1^* = 1/c.$$

Defining $(1, -\gamma, -\rho) = ca^*$, (20) becomes the desired result (18). 

If $R = 1$, wealth can be factored out of the utility function additively 
and the proof is nearly identical. Q.E.D. 

Theorem I asserts that for a Type B individual, if the optimal expected 
return is uniformly bounded, then it must be the case that the arbitrage 
condition 

$$E_i \approx \rho + \beta_i\gamma = \rho + \gamma_1\beta_{i1} + \cdots + \gamma_k\beta_{ik},$$

holds in the approximate sense that the sum of squared deviations is 
uniformly bounded. This implies, among other things, that as n increases 

$$|E_n - \rho - \beta_n\gamma| \to 0. \qquad (21)$$

A number of simple corollaries of Theorem I are available. If we adopt 
the alternative interpretation, suggested in Section I, that $\tilde{x}_i$ is the return 
on the ith activity, then wealth will be confined to a compact interval if 
there are a limited number of actual assets. It is easy to see that if wealth is 
confined to a compact interval on which the utility function is bounded, 
then Theorem I will hold for any risk averse agent. We also have the 
following corollary. 


Corollary 1. Under the conditions of Theorem I if there is a riskless 
asset then $\rho$ may be taken to be its rate of return. 


Proof. The return per unit of wealth in the presence of a riskless asset 
is given by 

$$\rho + \alpha(\tilde{x} - \rho),$$

where $\alpha$ is now the portfolio of risky assets. Deleting the constraint that 
$\eta e = 0$ we can simply repeat the proof of Theorem I with $(E - \rho e)$ in the 
place of the E vector. Q.E.D. 


Corollary 1, of course, also extends to the alternative interpretation. 

To turn these results into a capital market theory we will assume that 
there is at least one Type B individual who does not become negligible 
as the number of assets, n, is increased. The following definition is helpful. 


Definition. The agent, $a^\nu$, will be said to be asymptotically negligible 
if, as the number of assets increases, 

$$\omega^\nu \equiv w^\nu / w \to 0,$$





where $w^\nu$ is the agent's wealth and w is total wealth, i.e., 

$$w \equiv \sum_\nu w^\nu.$$

For example, an agent will not be asymptotically negligible if the 
sequence of proportionate quantities of assets the agent is endowed with 
is bounded away from zero. 

Assumption 2 (Nonnegligibility of Type B agents). There exists at 
least one Type B agent who believes that returns are generated by a model 
of the form of (15) and who is not asymptotically negligible. 

To permit us to aggregate to a market relation we will make three more 
assumptions; essentially we must ensure that Theorem I will not be 
“undone” by the rest of the economy. First we assume that agents hold 
compatible subjective beliefs. 

Assumption 3 (Homogeneity of expectations). All agents hold the 
same expectations, E. Furthermore, all agents are risk averse.⁶ 

Assumption 4 (Extent of disequilibria). Let $\xi_i$ denote the aggregate 
demand for the ith asset as a fraction of total wealth. We will assume that 
only situations with $\xi_i \ge 0$ are to be considered. 

Notice that Assumption 4 does not rule out the possibility that an asset 
can be in excess supply; it only implies that the economy as a whole will 
wish to hold some of it. Assumptions 3 and 4 can be weakened 
considerably as will be shown below, but for purposes of demonstration we have 
chosen to leave them in a stronger than necessary form. 

Lastly, we need to specify the generating model (15) a bit more. 

Assumption 5 (Boundedness of expectations). The sequence $\langle E_i \rangle$ is 
uniformly bounded, i.e., 

$$\|E\| \equiv \sup_i |E_i| < \infty. \qquad (22)$$

Assumption 5 will be discussed in Section III. 
We can now prove our central result. 


6 The assumption of risk aversion is quite weak since if fair gambles are permitted, 
any bounded nonconcave portions of agents' utility functions would be irrelevant. See 
Raiffa [11] or Ross [12] for an elaboration of this point. 





Theorem II. Given Assumptions 1 through 5, $(\exists \rho, \gamma)$ 

$$\sum_{i=1}^{\infty} \{E_i - \rho - \beta_i\gamma\}^2 < \infty. \qquad (18)$$

Furthermore, if there is a riskless asset, then $\rho$ is its rate of return.⁷ 

Proof. From Theorem I we know that if the conclusion is false then 
for the Type B agent (on a subsequence) 

$$\alpha^0 E \to \infty. \qquad (23)$$

Let the total fraction of wealth held by the Type B agent be given by $\omega^0$ 
and by the rest of the economy by $\omega$. If $\alpha_i$ denotes the fraction of $\omega$ held 
in asset i by the rest of the economy then by Assumption 4 

$$\xi_i \equiv \omega^0\alpha_i^0 + \omega\alpha_i \ge 0.$$

By definition, 

$$\sum_i \xi_i = 1,$$

hence 

$$\|E\| \ge \sum_i \xi_i E_i = \sum_i (\omega^0\alpha_i^0 + \omega\alpha_i)E_i = \omega^0 \sum_i \alpha_i^0 E_i + \omega \sum_i \alpha_i E_i.$$

From (23) and Assumption 2 the first sum in the last expression is 
divergent, which together with Assumption 5 (22) implies that 

$$\omega \sum_i \alpha_i E_i \to -\infty.$$

Since 

$$\omega\alpha_i = \sum_\nu \omega^\nu\alpha_i^\nu,$$

where $\omega^\nu$ is the fraction of wealth held by $a^\nu$, it follows that 

$$\omega \sum_i \alpha_i E_i = \sum_i \sum_\nu \omega^\nu\alpha_i^\nu E_i = \sum_\nu \Big\{ \sum_i \omega^\nu\alpha_i^\nu E_i \Big\}$$

7 Theorems I and II and Corollary 1 can be extended to the case where (15) holds 
for only a subset of the assets by generalizing the utility function to be a Lebesgue 
dominated sequence of functions conditional on the other assets. 





and for some agent, $\nu'$,

$$\sum_i \omega^{\nu'} \alpha_i^{\nu'} E_i \to -\infty$$

on a subsequence. By Assumptions 1 and 3 this contradicts optimality.
The identification of $\rho$ with the riskless return follows from Corollary I.

Q.E.D.

Theorem II has a straightforward extension to the alternative interpretation
of $x_i$ as the return on activity $i$. In the extension, though, we can,
of course, drop Assumption 2 and obtain (18) from Assumptions 1, 3, 4,
and 5 alone.

As was shown in Ross [14] the basic result of Theorem II can be written
in a number of empirically interesting and intuitively appealing formats.
For example, by appropriate normalization it can be shown that

$$E_i - \rho \approx \beta_{i1}(E^1 - \rho) + \cdots + \beta_{ik}(E^k - \rho), \tag{24}$$

where $E^l$ is the return on all portfolios with $\alpha\beta^j = 0$ for $j \ne l$ and
$\alpha\beta^l = 1$. The constant $\rho$ is now the return on all $\alpha\beta = 0$, i.e., zero-beta
portfolios. Thus, the risk premium on an asset is the $\beta$-weighted sum of
the factor risk premiums.
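
As a minimal sketch of how relation (24) is read, the following computes an approximate expected return from a zero-beta return and a set of factor returns. All numbers and the function name `expected_return` are hypothetical and chosen only for illustration; nothing here is taken from the data used later in the text.

```python
def expected_return(rho, betas, factor_returns):
    """Right-hand side of (24) moved to solve for E_i.

    rho            -- return on zero-beta portfolios
    betas          -- factor loadings beta_i1, ..., beta_ik of asset i
    factor_returns -- returns E^1, ..., E^k on the unit-beta factor portfolios
    """
    premium = sum(b * (e - rho) for b, e in zip(betas, factor_returns))
    return rho + premium


# Hypothetical two-factor illustration (assumed numbers).
rho = 0.05
factor_returns = [0.08, 0.06]
betas = [1.2, 0.5]

e_i = expected_return(rho, betas, factor_returns)
print(round(e_i, 4))

# An asset with zero loadings on every factor earns the zero-beta return.
print(expected_return(rho, [0.0, 0.0], factor_returns))
```

The risk premium in this illustration is 1.2 times the first factor premium plus 0.5 times the second, which is the β-weighted sum described above.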

While we have formally proven the main result that the sum of squared
deviations from the basic pricing relation is bounded above as the number
of assets increases, it is worthwhile spending some effort to obtain an
empirical estimate of the size of this bound. To do this we will work
with a more exact form of our results. Examining the proof of Theorem I
and Lemma I in the Appendix, we have found a bound to

$$\sum_{i=1}^{n} [E_i - \rho - \beta_i\gamma]^2 < c^2 A = (1/ha)\,c^2,$$

or, using the exact form of Lemma I (obtained by leaving the $H^n$ matrices
in the sum), we have

$$\sum_{i=1}^{n} (1/\sigma_i^2)[E_i - \rho - \beta_i\gamma]^2 < c^2/a, \tag{25}$$

where $c$ is the return premium on the arbitrage portfolio over a risk free
rate ($-t$ in (19)) and $a$ is the lower limit on the variance of an arbitrage
portfolio.





If we assume that the market portfolio, as a well-diversified portfolio,
cannot be grossly inefficient in a mean variance sense, and if we ignore
ex ante-ex post distinctions, then we can use observed market data (see
Friend [4] and Myers [9] for the data which follow) to obtain a rough
estimate of the bound in (25). Over the period from January 1, 1962 to
December 31, 1971 the yearly market return (Standard and Poor’s
Composite Index) averaged 7.4% and the risk free rate (prime corporates
with 1 year to maturity) averaged 5.1%, for a market risk premium of

$$c = 2.3\%.$$

The sample variance of the market portfolio in this period was $(0.123)^2$,
and we will assume that no arbitrage portfolio earning the market risk
premium could have had less than one-half the market variance. Hence,

$$a = \tfrac{1}{2}(0.123)^2,$$

and from (25),

$$\sum_{i=1}^{n} (1/\sigma_i^2)[E_i - \rho - \beta_i\gamma]^2 < 2(0.023)^2/(0.123)^2.$$

The average residual variance in this period from regressions of asset
returns on the market portfolio was about $2(0.123)^2$ and using this as a
proxy for $\sigma_i^2$, the average squared discrepancy is approximately

$$\operatorname{average}(E_i - \rho - \beta_i\gamma)^2 < (1/n)\,4(0.023)^2.$$

Taking the number of assets n to be the combined total of listed issues on
the NYSE and the Amex on December 31, 1971, about 3000, the average
absolute discrepancy is given by

$$\operatorname{average}\,|E_i - \rho - \beta_i\gamma| < 2 \cdot 0.023/3000^{1/2} = 0.00084,$$

or about 1% of the market return of 7.4%.
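
The arithmetic behind this estimate can be retraced with a short script. It is only a sketch of the calculation as described in the text: the inputs are the figures quoted above, the variable names are ours, and the proxy of twice the market variance for every $\sigma_i^2$ is the text's own rough assumption.

```python
import math

# Figures quoted in the text for 1962-1971.
market_return = 0.074      # average yearly market return
risk_free = 0.051          # average risk free rate
market_sd = 0.123          # sample standard deviation of the market
n_assets = 3000            # approximate number of listed issues

c = market_return - risk_free            # market risk premium, about 0.023
a = 0.5 * market_sd ** 2                 # assumed lower limit on arbitrage variance

bound_25 = c ** 2 / a                    # right-hand side of (25)
residual_var = 2 * market_sd ** 2        # proxy used for each sigma_i^2

# Sum of squared discrepancies is below residual_var * bound_25 = 4 c^2.
avg_squared = residual_var * bound_25 / n_assets
avg_abs = math.sqrt(avg_squared)         # 2 c / sqrt(n)

print(round(bound_25, 4))
print(round(avg_abs, 5))
print(round(avg_abs / market_return, 3))
```

The last ratio is how the "about 1%" figure arises: the bound on the discrepancy is compared with the 7.4% market return.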

Of course these estimates are very crude and are only intended to be
indicative; assets with a high own variance will have a greater latitude for
discrepancies than those with low own variances. Most importantly,
though, to the extent that there is significant cross-sectional correlation
across the $\tilde\epsilon_i$ terms, the addition of further factors should reduce the own
variance terms, $\sigma_i^2$, and improve the estimates.





III. Generalizations and Conclusions 

One of the strengths of Theorem II is that it does not require the
stringent homogeneity of anticipations of the mean-variance theory.
We are now obviously distinguishing between expectations, i.e., the
$E$ vector, and anticipations, the whole model (15). If other agents have the
same ex ante expectations, but believe returns are generated in a different
fashion, then (24) must still hold where $\beta$ is that of the return generating
model believed to hold by the Type B agent. Of course, this is a bit
gratuitous since in this model, as in all others, it is necessary to translate the
results into observable quantities and the usual ex ante-ex post identity
becomes ambiguous with disparate anticipations. Even if all agents agree
on (15), however, there is still considerable scope for disagreement on the
underlying probability distributions. For example, if $\tilde\delta$ represents a market
or “GNP” factor, then as long as all agents agree on the impact of this
factor on returns, through the corresponding $\beta$ coefficients, they can hold a
variety of views on the distribution of $\tilde\delta$ without violating the basic
arbitrage condition, (24). Similarly, agents can also disagree on the
distribution of the idiosyncratic noise terms, $\tilde\epsilon_i$, without altering (24).
The primary difficulty with the analysis arises when agents differ in their
expectations, $E^\nu$. Now the proof of Theorem II must be modified since,
unless all $E^\nu$ vectors are positive multiples of the same vector, we cannot
be assured that the divergence of $\alpha^\nu E^\tau$ to $-\infty$ for $\tau \ne \nu$ implies that
$\alpha^\nu E^\nu \to -\infty$. This is a fruitful area for generalizations.

It is also possible to weaken the condition that the $\tilde\epsilon_i$ be mutually
uncorrelated. For example, if the assets can be ordered so that $\tilde\epsilon_i$ and $\tilde\epsilon_j$ are
uncorrelated if $|i - j|$ exceeds a given number, then the analysis is
unchanged. In general, any weakening that permits a law of large numbers
to hold should be sufficient, although weaker forms of the law would result
in weaker approximation norms for the pricing relation (24).

Lastly, it should be emphasized that (24) is much more of an arbitrage
relation than an equilibrium condition and may be expected to be quite
robust. Assumptions 4 and 5 served only to guarantee that the market
return,

$$E_m \equiv \sum_i \xi_i E_i, \tag{26}$$

would be uniformly bounded and this will hold in a wide class of
disequilibrium situations. Rather than simply assuming that $E_m$ was bounded,
we chose to make Assumptions 4 and 5 directly to see how sufficient
conditions for a bounded $E_m$ would appear in alternative economic
situations. For example, Assumption 4 can be weakened if, instead of
having required all $\xi_i \ge 0$, we had assumed that $\sum_i |\xi_i|$ was bounded,
i.e., we had bounded the sum of the absolute proportions of wealth placed
(or shorted) in all assets. This would also be sufficient to bound the
market return. In practice, these are very weak conditions and easily
satisfied. 8

In conclusion, we have set forth a rigorous basis for the arbitrage
relation and arguments analyzed in Ross [14] (and [13]), and the conditions
which are sufficient to support the theory have some intuitive
appeal. On a less optimistic note, though, while significantly weakening
the assumption that investors have identical (or homogeneous) anticipations,
the arbitrage theory still requires essentially identical expectations
and agreement on the $\beta$ coefficients if the identification of ex ante
beliefs with ex post realizations is to provide empirically fruitful results.
If this assumption is to be fundamentally weakened, this theory (and all 
others) will require a closer examination of the dynamics by which ex ante 
beliefs are transformed into ex post observations. Such a study properly 
lies in the domain of general disequilibrium dynamics and, in particular, 
should focus on the impact of information on markets. It is one of the 
most difficult and important areas for future research. 


8 A strong form of Theorem II can be obtained by assuming that the weighted sum
of subjectively viewed expected portfolio returns

$$\sum_\nu \omega^\nu \sum_i \alpha_i^\nu E_i^\nu \tag{F1}$$

is uniformly bounded. This would permit us to delete Assumptions 4, 5, and even 3
and, formally at least, would allow heterogeneous expectations. Alternatively, we
could replace Assumption 5 with $\|E^\nu\| < \infty$, retain Assumption 4 (or the weaker form
described in Section III) and drop Assumption 3.

Furthermore, if agents agree on factors and if the actual ex post model generating
returns is some convex combination (say wealth weighted, or, for that matter, any
uniformly sup norm bounded linear operator) of the individual market ex ante models,
then the basic arbitrage condition will be expressible in ex post observables and, as
such, will be directly testable. See Ross [14] for a fuller discussion of these issues. None
of this, however, is very satisfactory. For one thing, it is not clear what is the force of
these boundedness conditions, particularly when the number of agents is typically
much larger than the number of marketed assets. As an example, if we have two Type B
agents with exactly divergent beliefs (in a sense which can be made precise in special
examples) then they can exactly offset each other. There is now no reason to expect
(F1), unlike (26), to be bounded simply because observed ex post return is bounded.
For another, we must translate the theory into a statement about observables and this
requires relating divergent subjective ex ante expectations to ex post ones via the
“right” generating mechanism in a less ad hoc fashion. This is the problem posed in
Section III and makes the “strong” version of Theorem II inadequate to stand alone.





Appendix 1 


In this appendix we prove the lemma referred to in the proofs of the
paper. Define a sequence of $n \times k$ matrices, $\langle X^n \rangle$, by taking the first row,
the first two rows, and so on of an infinite matrix with $k$ columns.

Lemma I. Let $\langle X^n \rangle$ be a sequence of $n \times k$ matrices and let $\langle H^n \rangle$ be a
sequence of diagonal matrices with diagonal elements $\langle h_1 \rangle$, $\langle h_1, h_2 \rangle$, and so
on where, for some $h$, $h_i \ge h > 0$ for all $i$. Assume $(\exists b, a)$ $(\forall X^n$ of full rank$)$

$$b'[X^{n\prime} H^n X^n]^{-1} b \ge a > 0. \tag{A1}$$

It follows that $(\exists a^*$ and $A)$

$$(X^n a^*)'(X^n a^*) \le A < \infty$$

and

$$a^{*\prime} b = 1.$$

Proof. The result is trivial if $X^n$ is of less than full rank for all $n$. In
addition, if $X^n$ is of full rank for some $n$ $(\ge k)$ then $X^{\hat n}$ is of full rank,
$\hat n > n$, and we may assume that the sequence $\langle X^n \rangle$ $(n \ge k)$ is of full rank
for all $n$. By positive definiteness $X^{n\prime} H^n X^n$ is of full rank and (A1) holds.

Consider the problem:

$$\min\, (X^n z^n)' H^n (X^n z^n),$$

subject to

$$z^{n\prime} b = 1.$$

The solution is given by

$$z^n = \gamma\,[X^{n\prime} H^n X^n]^{-1} b,$$

where

$$\gamma = (X^n z^n)' H^n (X^n z^n) = (b'[X^{n\prime} H^n X^n]^{-1} b)^{-1} \le 1/a < \infty,$$

by (A1). Consequently, from the lower bound on $\langle h_i \rangle$ we now obtain

$$(X^n z^n)'(X^n z^n) \le A \equiv 1/ha < \infty.$$

Letting $y^n \equiv X^n z^n$ implies that $y^{n\prime} y^n \le A$. If $X$ is a full rank submatrix
of $X^n$ then

$$X z^n = y^n \,|\, X,$$

where $y^n \,|\, X$ is the corresponding subvector of $y^n$, and since $y^n \,|\, X$ is
bounded in the norm it has a convergent subsequence. Letting $y^*$ be its
limit we must have $z^n \to a^* \equiv X^{-1} y^*$ on the subsequence. It remains to
show that $(\forall n)\,(X^n a^*)'(X^n a^*) \le A$. Assume to the contrary that for some $\hat n$
(and, therefore, all $n > \hat n$)

$$(X^{\hat n} a^*)'(X^{\hat n} a^*) > A.$$

Since $z^n \to a^*$ on a subsequence we would have the contradiction

$$(X^n z^n)'(X^n z^n) \ge (X^{\hat n} z^n)'(X^{\hat n} z^n) > A \quad \text{for some } n.$$

It follows that $(\forall n)\,(X^n a^*)'(X^n a^*) \le A$. In addition, since $z^{n\prime} b = 1$
for all $n$ we must also have $a^{*\prime} b = 1$. Q.E.D.
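
The constrained minimization used in the proof can be checked numerically for a small case. The sketch below is written for k = 2 columns only, so that the 2 x 2 inverse can be spelled out by hand; the matrix rows, weights, and vector b are hypothetical values chosen for illustration, and the helper names are ours.

```python
def solve_constrained_min(X, h, b):
    """Minimise (Xz)'H(Xz) subject to z'b = 1 when X has two columns.

    Returns the minimiser z = gamma * [X'HX]^{-1} b and the value gamma.
    """
    # Entries of the symmetric 2x2 matrix M = X'HX.
    m11 = sum(hi * r[0] * r[0] for hi, r in zip(h, X))
    m12 = sum(hi * r[0] * r[1] for hi, r in zip(h, X))
    m22 = sum(hi * r[1] * r[1] for hi, r in zip(h, X))
    det = m11 * m22 - m12 * m12
    # w = M^{-1} b, using the closed-form inverse of a 2x2 matrix.
    w = ((m22 * b[0] - m12 * b[1]) / det,
         (-m12 * b[0] + m11 * b[1]) / det)
    gamma = 1.0 / (b[0] * w[0] + b[1] * w[1])
    return (gamma * w[0], gamma * w[1]), gamma


def quad_form(X, weights, z):
    """Weighted sum of squares of the entries of Xz."""
    return sum(wt * (r[0] * z[0] + r[1] * z[1]) ** 2
               for wt, r in zip(weights, X))


# Hypothetical data: four rows, diagonal weights bounded below by h_min.
X = [(1.0, 0.5), (1.0, 1.0), (1.0, 1.5), (1.0, 2.0)]
h = [2.0, 1.0, 4.0, 1.5]
b = (1.0, 0.0)
h_min = min(h)

z, gamma = solve_constrained_min(X, h, b)
weighted = quad_form(X, h, z)                 # equals gamma at the optimum
unweighted = quad_form(X, [1.0] * len(X), z)  # bounded by gamma / h_min

print(z, gamma)
print(weighted, unweighted, gamma / h_min)
```

The last line illustrates the step from the weighted bound to the unweighted one: because every weight is at least `h_min`, the plain sum of squares cannot exceed `gamma / h_min`.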


Appendix 2 

In this appendix we discuss the relationship between convergence in
quadratic mean (q.m.) and expected utility. The technical results can be
found in Loève [8] and Billingsley [1].

We can begin with a simple but powerful result. Let $\langle \tilde X_n \rangle$ be a sequence
of random variables with $E\{\tilde X_n\} = 0$, and $\tilde X_n \to 0$ (q.m.), i.e., $\sigma^2(\tilde X_n) \to 0$.

Proposition. If $U(\cdot)$ is concave and bounded below (which implies that
the domain of $U(\cdot)$ is left bounded), then

$$E\{U[\rho + \tilde X_n]\} \to U(\rho).$$

Proof. By Fatou’s lemma

$$\liminf E\{U[\rho + \tilde X_n]\} \ge U(\rho),$$

but by concavity

$$E\{U[\rho + \tilde X_n]\} \le U(\rho),$$

hence

$$\lim E\{U[\rho + \tilde X_n]\} = U(\rho).$$

Q.E.D.
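
A small numerical illustration of the Proposition, under assumptions that are ours and not the paper's: take $U(x) = \sqrt{x}$ on $x \ge 0$ (concave and bounded below by zero), $\rho = 1$, and let $\tilde X_n$ equal $+1/n$ or $-1/n$ with equal probability, so that its mean is zero and its variance $1/n^2$ tends to zero.

```python
import math


def expected_utility(rho, n):
    """E{U[rho + X_n]} for U = sqrt and X_n = +1/n or -1/n, each with prob. 1/2."""
    x = 1.0 / n
    return 0.5 * (math.sqrt(rho + x) + math.sqrt(rho - x))


rho = 1.0
sample_n = (1, 10, 100, 1000)
values = [expected_utility(rho, n) for n in sample_n]

for n, v in zip(sample_n, values):
    # Concavity keeps each value at or below U(rho); the gap shrinks with n.
    print(n, v, math.sqrt(rho) - v)
```

Each expected utility stays at or below $U(\rho) = 1$, as concavity requires, and the gap closes as the variance of $\tilde X_n$ shrinks.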

A problem arises when $U(\cdot)$ is unbounded from below. About the
weakest condition which assures convergence is uniform integrability
(U.I.):

$$\lim_{\alpha \to \infty} \sup_n \int_{\Omega_\alpha} |U(\rho + \tilde X_n)| \, d\eta_n = 0,$$

$$\Omega_\alpha \equiv \{|U(\rho + \tilde X_n)| > \alpha\},$$

where $\eta_n$ is the distribution function of $\tilde X_n$.





A number of familiar conditions imply U.I. If the sequence $U(\rho + \tilde X_n)$
is bounded below by an integrable function the Lebesgue convergence
theorem can be invoked or if $(\exists \delta > 0)$

$$\sup_n E\{|U(\rho + \tilde X_n)|^{1+\delta}\} < \infty,$$

then the sequence is U.I.

In general, then, the convergence criterion will depend on both the
utility function and the random variables. It is possible, however, to find
weak sufficient conditions on the random variables alone, by taking
advantage of the structure of $\tilde X_n$, but the condition that $\tilde X_n = (1/n) \sum_i \tilde\epsilon_i$;
$\sigma_i^2$ uniformly bounded and $\tilde\epsilon_i$, $\tilde\epsilon_j$ independent is not sufficient. 9

In the text, it is assumed that all sequences satisfy the U.I. condition,
and therefore

$$\tilde X_n \to a \quad \text{(q.m.)}$$

will imply that

$$E\{U(\tilde X_n)\} \to U(a).$$


References 

1. P. Billingsley, “Convergence of Probability Measures,” Wiley, New York, 1968.

2. F. Black, Capital market equilibrium with restricted borrowing, J. Business 45
(1972), 444-455.

3. M. Blume and I. Friend, A new look at the capital asset pricing model, J. Finance
(March 1973), 19-33.

4. I. Friend, Rates of return on bonds and stocks, the market price of risk, and the
cost of capital, Working Paper No. 23-73, Rodney L. White Center for Financial
Research, 1973.

5. J. Green, Preexisting contracts and temporary general equilibrium, in “Essays on
Economic Behavior under Uncertainty” (Balch, McFadden, and Wu, Eds.),
North-Holland, Amsterdam, 1974.

6. M. Jensen, Capital markets: theory and evidence, Bell J. Econ. and Management
Science 3 (1972), 357-398.

7. J. Lintner, The valuation of risk assets and the selection of risky investments in
stock portfolios and capital budgets, Rev. Econ. Statist. (February 1965), 30-55.

8. M. Loève, “Probability Theory,” Van Nostrand, Princeton, N. J., 1963.

9. S. Myers, A reexamination of market and industry factors in stock price behavior,
J. Finance (June 1973), 695-705.

10. J. Pratt, Risk aversion in the small and in the large, Econometrica 32 (1964),
122-137.

11. H. Raiffa, “Decision Analysis,” Addison-Wesley, Reading, Mass., 1968.


9 It is not difficult to construct counterexamples by having $U(\cdot)$ go to $-\infty$ rapidly
enough as x approaches its lower bound.





12. S. Ross, Comment on “Consumption and Portfolio Choices with Transaction
Costs,” by E. Zabel and R. Mukherjee, in “Essays on Economic Behavior under
Uncertainty” (Balch, McFadden, and Wu, Eds.), North-Holland, Amsterdam, 1974.

13. S. Ross, Portfolio and capital market theory with arbitrary preferences and
distributions: The general validity of the mean-variance approach in large markets,
Working Paper No. 12-72, Rodney L. White Center for Financial Research, 1971.

14. S. Ross, Return, risk and arbitrage, in “Risk and Return in Finance” (Friend and
Bicksler, Eds.), Ballinger, Cambridge, Mass., forthcoming.

15. W. Sharpe, Capital asset prices: A theory of market equilibrium under conditions
of risk, J. Finance (September 1964), 425-442.

16. J. Treynor, Toward a theory of market value of risky assets, unpublished
manuscript, 1961.



JOURNAL OF ECONOMIC THEORY 13, 361-379 (1976) 


Impossibility Theorems without Collective Rationality* 

Douglas H. Blair 

Department of Economics, University of Pennsylvania, 3718 Locust Walk CP,
Philadelphia, Pennsylvania 19174

Georges Bordes

Laboratoire d'Analyse et de Recherche Économiques,
Université de Bordeaux 1, Avenue Léon Duguit, 33604 Pessac, France

Jerry S. Kelly 

Department of Economics, Syracuse University, Syracuse, New York 13210 

AND 

Kotaro Suzumura 

The London School of Economics, Houghton Street, London WC2A 2AE, England and 
The Institute of Economic Research, Kyoto University, Sakyo-ku, Kyoto, Japan 

Received July 23, 1974; revised December 3, 1975 


1. Introduction 

Arrow’s general impossibility theorem [2] demonstrated the incompatibility
of five conditions on collective choice rules: unrestricted domain,
nondictatorship, the Pareto condition, independence of irrelevant alternatives,
and transitive rationality of the social choice function. The last condition
requires the existence of a social preference ordering such that, given a set 
of alternatives, the chosen elements are those which are best with respect 
to that ordering. Since the publication of Arrow’s theorem, an extensive 
body of literature has appeared seeking to circumvent the difficulty. 
This paper focuses on attempts to resolve the paradox by weakening the 
collective rationality requirement. 

* This paper represents a consolidation of overlapping work done independently
by the four authors [3, 5, 13, 26]. The authors are indebted to the referees of the Journal
of Economic Theory, the Review of Economic Studies, and Econometrica for seeing the
possibility of such a consolidation. Thanks are also due to Donald J. Brown, John A.
Ferejohn, Robert P. Parks, and Amartya K. Sen.

Copyright © 1976 by Academic Press, Inc.
All rights of reproduction in any form reserved.





For our purpose, it is convenient to decompose Arrow’s collective 
rationality requirement into two parts: 

(a) Rationality. There exists a social preference relation R such
that the elements chosen out of a set of available alternatives S are those
which are best in S with respect to R. (R will be referred to as a
rationalization.)

(b) Transitivity and connectedness of the rationalization. 

The sensitivity of Arrow’s result to the specification of the degree of 
rationality was first noticed by Sen [22]. He continued to impose rationality 
but relaxed the second component to require only connectedness and 
quasi-transitivity (that is, transitivity of strict preference), and showed 
that this weakened collective rationality requirement is compatible with 
the remainder of Arrow’s conditions. Gibbard [10] subsequently proved 
that any society whose collective choice rule meets Sen’s conditions 
contains an oligarchy, a class of individuals who are jointly decisive for 
exclusion of an alternative from the social choice out of a two-element 
set and each of whose members is individually decisive for inclusion of an 
element in the choice from such a set. Any individual who by strictly
preferring x to y can ensure that y is not socially preferred to x is called a
weak dictator; every member of an oligarchy is clearly a weak dictator.
Mas-Colell and Sonnenschein [14] provided the first published proof of
Gibbard’s theorem and proved an alternative impossibility result: even if
weak dictators are to be countenanced, their multiplicity causes
quasi-transitive rational (and otherwise Arrovian) collective choice rules to
violate a decisiveness condition they call positive responsiveness. Thus,
demanding merely quasi-transitive rationality of social choice provides
no satisfactory resolution of Arrow's antidemocratic result. Even the
smallest nondictatorial oligarchy (of two) fails a requirement of
responsiveness (which is admittedly quite strong) when there are more than
two voters; enlarging and “democratizing” the oligarchy aggregates the
heterogeneity of individual preferences into widespread social indifference
rather than intransitivity.

Further weakening of the consistency requirement imposed by Arrow’s 
collective rationality (while continuing to insist on the existence of a 
rationalization) is entailed by requiring acyclicity (nonexistence of a strict 
preference cycle) instead of quasi-transitivity of the social preference 
relation. The importance of this substitution comes from the observation 
that acyclicity is necessary and sufficient to guarantee that society is able 
to make a nonempty rational choice from any finite subset of the set of 
alternatives. In the case of individuals with acyclic preferences choosing
over an infinite set of alternatives, Brown [7] has shown that the only acyclic
collective choice rules which satisfy the remainder of Arrow’s conditions 
and are not oligarchic are those of what he calls collegial polities. Under 
such a procedure, there exists a quasi-oligarchy , a subset of individuals 
whose unanimous assent is a necessary condition for the exclusion of an 
alternative from the social choice out of a two-element set. In contrast 
with the Gibbardian oligarchy, consensus within the quasi-oligarchy, 
though necessary, is not sufficient for exclusion. For this class of decision 
rules, at least one individual outside the quasi-oligarchy must also prefer x 
to y to ensure a similar social preference. Thus, weakening Arrow’s 
transitive rationality to require only acyclic rationality is a step in the 
democratic direction. The complete asymmetry between the power of 
individuals within and outside the oligarchy is diluted when
quasi-transitivity is abandoned. Some non-quasi-oligarchs do have power: they
are pivotal to the success of some winning coalitions. Nevertheless the
tradeoffs remain between heterogeneity of preferences, decisiveness, and
inequalities in the distribution of power, as is shown by another
Mas-Colell-Sonnenschein theorem which asserts that no acyclic collective
choice rules exist satisfying both their no-weak-dictators and positive 
responsiveness conditions along with the remainder of Arrow’s conditions. 
This proposition imposes no restrictions on the size of the alternatives set. 
In the case of individuals with acyclic preferences choosing over a finite 
set, Brown [6] has obtained a precise characterization of acyclic Arrovian 
collective choice rules which indicates clearly how they violate the positive
responsiveness requirement. Collegial polities are of course acyclic even 
on a finite set, and nontrivial ones are obviously unresponsive to changes 
in the preferences of some voters. The only anonymous acyclic procedures 
in the finite case, as Brown shows, are rules satisfying the following 
condition: if M is the number of alternatives, every M-tuple of decisive 
coalitions of individuals must have nonempty intersection. This class of
procedures includes simple special-majority rules (e.g., $ majority) and 
representative systems with a special-majority rule at each stage. For 
alternative sets which are large relative to the size of the set of individuals, 
these procedures are close to the unanimity rule. 
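
The intersection condition can be explored by brute force for small electorates. The sketch below is our own illustration, not Brown's construction: it assumes a rule with n voters in which a coalition is decisive when it has at least q members, enumerates every M-tuple of such coalitions, and compares the result with the elementary counting condition M(n - q) < n (each coalition omits at most n - q voters, so M of them can jointly omit everyone only if M(n - q) is at least n).

```python
from itertools import combinations, product


def decisive_coalitions(n, q):
    """All coalitions of at least q of the n voters (assumed decisive)."""
    voters = range(n)
    return [frozenset(c)
            for size in range(q, n + 1)
            for c in combinations(voters, size)]


def every_tuple_intersects(n, q, m):
    """True if every m-tuple of decisive coalitions has a common member."""
    coalitions = decisive_coalitions(n, q)
    return all(frozenset.intersection(*group)
               for group in product(coalitions, repeat=m))


def counting_condition(n, q, m):
    """Elementary sufficient-and-necessary count for the q-of-n rule."""
    return m * (n - q) < n


n = 5
for q in (3, 4, 5):
    for m in (1, 2, 3, 4):
        print(q, m, every_tuple_intersects(n, q, m), counting_condition(n, q, m))
```

With five voters and three alternatives, for instance, a quota of four passes the check while simple majority (a quota of three) does not, which matches the remark that large alternative sets push these rules toward unanimity.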

Given all these results involving the weakening of Arrow’s transitivity 
requirement, it is not surprising to find attacks focusing directly on 
rationality itself. Both Schwartz [21] and Plott [16-18] have criticized 
the demand for the existence of rationalizations. Plott [18] argues that a 
major reason for Arrow’s insistence on transitive rationality was that it
ensured that the social choice would be invariant under arbitrary
manipulations of the agenda, that is, the order and method by which alternatives
are compared and inferior ones discarded (see Arrow [2, p. 120]). He
proposes a consistency requirement for choice functions which he calls
path independence:

The alternatives are “split up” into smaller sets, a choice is made over each 
of these sets, the chosen elements are collected, and then a choice is made from 
them. Path independence, in this case, would mean that the final result would be 
independent of the way the alternatives were initially divided up for
consideration (Plott [17, pp. 1079-1080]). 1

A fairly natural question now arises: What happens to impossibility 
theorems when path independence is substituted for transitive rationality 
and the remaining Arrovian conditions prevail? Plott [17, 18] observes, 
citing Sen [22] as a source, that the collective choice rule which chooses 
the Pareto optimal subsets from available alternatives sets serves as a 
counterexample to a proposed impossibility result. Unfortunately this 
collective choice rule runs afoul of Gibbard’s theorem; it is also too 
undiscriminating in the face of heterogeneous individual preferences. 
It is important to notice that there exist path-independent choice functions 
which have no rationalization. Plott’s position on impossibility results 
with path independence but without rationality is ambiguous. He has said 
that “some of the standard constructions in welfare economics such as 
social welfare functions and social preference relations unduly restrict 
the set of admissible policies and consequently induce impossibility results” 
(Plott [16, p. 182]) and that, with the relaxation of rationality, “the 
immediate impossibility result discovered by Arrow is avoided” (Plott 
[17, p. 1075]). He has been careful, however, to observe that “the lines
which separate rationality properties, which induce immediate impossibility
results, from path independence properties are very thinly drawn”
[17, p. 1075]. Blair [4] and Parks [15] have exhibited examples of collective
choice rules which can result in path-independent but not quasi-transitive 
choices; both, however, suffer from the defect that they can generate choice 
functions which are not very selective. 

We prove in this paper several impossibility theorems in which we do
not require social choices to have a rationalization. One of our results
shows the incompatibility of non-weak-dictatorship, the Pareto condition,
independence of irrelevant alternatives, and path independence. Thus the
replacement of Arrow's collective rationality with Plott’s path independence
does not help us to escape from the Arrovian dilemma. Still weaker
consistency properties for social choice functions than path independence
will be proposed. They too, however, fail to provide us with a means of
avoiding impossibility results. As Arrow has written, “the paradox of
social choice cannot be so easily exorcised” [2, p. 109].

1 For procedures aggregating preferences over many alternatives, which must of
necessity be multistage processes due to computational costs, path independence is a
desirable property for two reasons. First, it rules out certain forms of institutional
arbitrariness, such as bias in favor of the status quo. Second, it precludes strategic
behavior at the agenda-determination stage.


2. The Structure of Choice Functions 

Before presenting our impossibility theorems, we will clarify in this
section the relationships between, on one hand, the rationality conditions
used in the existing impossibility theorems and, on the other, path
independence and some weaker conditions.

Let X denote the set of (mutually exclusive) alternatives. K stands for
a family of nonempty subsets of X. Each element $S \in K$ is an admissible
agenda; it contains the currently feasible alternatives in a given choice
situation. We assume throughout this paper that K contains all nonempty
finite elements of $\mathcal{P}(X)$, the power set of X. A choice function C on K is a
function which maps each $S \in K$ into a nonempty subset C(S) of S; note
that C(S) is not required to be a one-element set. Five properties of choice
functions will be of interest here:

Path independence (PI). $C(S_1 \cup S_2) = C(C(S_1) \cup C(S_2))$ for all $S_1, S_2 \in K$.

Chernoff condition (C). $S_1 \subseteq S_2 \Rightarrow C(S_2) \cap S_1 \subseteq C(S_1)$ for all $S_1, S_2 \in K$.
That is, every element chosen out of a set must also be chosen in every
subset of the set containing the element. 2

Property β. $[S_1 \subseteq S_2 \;\&\; C(S_1) \cap C(S_2) \ne \emptyset] \Rightarrow C(S_1) \subseteq C(S_2)$ for all
$S_1, S_2 \in K$. That is, if some chosen element from a set is chosen from a
superset of that set, then every such element is chosen from the superset.
(See Sen [24].)

Superset property (S). $S_1 \subseteq S_2 \Rightarrow$ not $[C(S_2) \subsetneq C(S_1)]$. That is, the choice
out of the superset of a set is not strictly contained in the choice out of the
set.

Generalized Condorcet property (GC). $(x \in S \;\&\; x \in C(\{x, y\})$ for all
$y \in S) \Rightarrow x \in C(S)$ for all $S \in K$. That is, if no element in a set beats a given
element x in a binary choice, then x must be among the elements chosen
from the set. 3

2 This condition, first introduced by Chernoff [8], has appeared in the literature
under a variety of names including, unhappily, “independence of irrelevant alternatives.”
It is discussed extensively by Arrow in [1].

A preference relation R is a binary relation on X having the
interpretation that xRy iff x is at least as good as y from the point of view of
the person or group in question. In the usual way we define from R the
subrelations P of strict preference and I of indifference. R is connected
iff xRy or yRx, transitive iff $xRy \;\&\; yRz \Rightarrow xRz$, quasi-transitive iff P is
transitive, and acyclic iff $(x_1 P x_2 P \cdots P x_t P x_1)$ for no finite subset
$\{x_1, \ldots, x_t\}$ of X. A transitive and connected relation will be called a
transitive ordering.

Choice functions induce preference relations in two ways. If there exists
a preference relation R satisfying $C(S) = \{x \in S : xRy \text{ for all } y \in S\}$ for all
$S \in K$, the choice function C will be called rational (R); in that event R is
a rationalization of C. A choice function is transitive rational (TR),
quasi-transitive rational (QTR), or acyclic rational (AR) if it has a
rationalization with the requisite property.

Alternatively, a preference relation may be derived from choice functions
restricted to two-element agenda sets. Even if C has no rationalization,
we can always define, following Herzberger [12], the base relation R*
as follows: $xR^*y$ iff $x \in C(\{x, y\})$ for all $\{x, y\} \in K$. Strict preference P*
may be defined in the obvious way. A choice function will then be said to
satisfy base quasi-transitivity (BQT) iff R* is quasi-transitive, base
acyclicity (BA) iff R* is acyclic, and base triple acyclicity (BTA) iff
$xP^*y \;\&\; yP^*z \Rightarrow xR^*z$.
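
These definitions can be made concrete with a small sketch. The three-element set and the relation below are hypothetical examples of ours (a is strictly preferred to b, b to c, and a is indifferent to c); the code builds the choice function rationalized by R, recovers the base relation from binary choices, and checks base quasi-transitivity and base triple acyclicity.

```python
from itertools import combinations

X = ("a", "b", "c")

# Hypothetical weak preference: pairs (x, y) mean xRy.
# a P b, b P c, a I c, plus reflexive pairs.
R = {("a", "a"), ("b", "b"), ("c", "c"),
     ("a", "b"), ("b", "c"), ("a", "c"), ("c", "a")}


def choose(S, rel=R):
    """C(S) = {x in S : xRy for all y in S}."""
    return {x for x in S if all((x, y) in rel for y in S)}


def base_relation(choice, alternatives):
    """xR*y iff x is chosen from the agenda {x, y}."""
    return {(x, y) for x in alternatives for y in alternatives
            if x in choice({x, y})}


def strict_part(rel):
    """Strict preference: xPy iff xRy and not yRx."""
    return {(x, y) for (x, y) in rel if (y, x) not in rel}


def is_transitive(rel):
    return all((x, w) in rel
               for (x, y) in rel for (z, w) in rel if y == z)


agendas = [set(c) for r in (1, 2, 3) for c in combinations(X, r)]
choices = {frozenset(S): choose(S) for S in agendas}

R_star = base_relation(choose, X)
P_star = strict_part(R_star)

bqt = is_transitive(P_star)
bta = all((x, w) in R_star
          for (x, y) in P_star for (z, w) in P_star if y == z)

print(choices[frozenset(X)])
print(R_star == R, bqt, bta)
```

In this example every agenda has a nonempty choice, the base relation coincides with R, base quasi-transitivity fails (a is not strictly preferred to c), and base triple acyclicity holds.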

We turn now to comparing these consistency conditions by decomposing
several of them into more basic parts. Sen [24] has proven that a choice
function has a transitive rationalization iff it satisfies both property β and
the Chernoff condition. Plott [17], in turn, has shown that quasi-transitive
rationality is equivalent to the conjunction of path independence and the
generalized Condorcet property. The relationship between these results
becomes more apparent when we further decompose path independence.

Theorem 1. A choice function is path independent if and only if it
satisfies the Chernoff condition and the superset property.

Proof. First we show that path independence implies the Chernoff
condition. Let $S_1 \subseteq S_2$, and let $x \in S_1 \cap C(S_2)$. By path independence,
$C(S_2) = C[C(S_2 - S_1) \cup C(S_1)] \subseteq C(S_2 - S_1) \cup C(S_1)$. Since $x \in S_1$,
$x \notin S_2 - S_1$, so $x \notin C(S_2 - S_1)$. Hence $x \in C(S_1)$; therefore
$S_1 \cap C(S_2) \subseteq C(S_1)$.

Next we show that path independence implies the superset property.

3 The Condorcet condition in its usual form is stated in terms of pairwise comparisons
by simple majority rule. Our condition is a weaker version of Sen’s property γ, discussed
in [24].





Suppose, contrary to that condition, that S, C S* and C(S*) £ C(S t ). By 
path independence, C(S*) = C[C(S a ) u CfJj)] = C[C(S,)]. By the first 
part of this proof, the Chemoff condition holds, so that from C(S t ) C S l 
we can derive C(S,) n C(Sf) C C[C(S t )]; thus C[C(5,)] = C(S X ). This 
yields C(Sf) = C(S t ), a contradiction. 

Finally, we obtain path independence from the superset property and
the Chernoff condition. Suppose $x \in C(S_1 \cup S_2)$. If $x \in S_1$, the Chernoff
condition implies $x \in C(S_1)$; if $x \in S_2$, then $x \in C(S_2)$. Hence
$x \in C(S_1) \cup C(S_2) \subset S_1 \cup S_2$. By another application of the Chernoff condition,
$x \in C[C(S_1) \cup C(S_2)]$. Thus $C(S_1 \cup S_2) \subset C[C(S_1) \cup C(S_2)]$. The inclusion
cannot be strict, however, because of the superset property and the fact that
$C(S_1 \cup S_2) \subset C[C(S_1) \cup C(S_2)]$. Therefore, $C(S_1 \cup S_2) = C[C(S_1) \cup C(S_2)]$.
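Theorem 1 can be confirmed by exhaustive search on a small universe. The following sketch is an illustration only, not part of the proof: it assumes a three-element set of alternatives with every nonempty subset admitted as an agenda, enumerates all choice functions on that domain, and compares path independence with the conjunction of the Chernoff condition and the superset property. All function names are our own.

```python
from itertools import combinations, product

def nonempty_subsets(items):
    """All nonempty subsets of `items`, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def all_choice_functions(alternatives):
    """Every map S -> nonempty subset of S, with S ranging over nonempty subsets."""
    agenda = nonempty_subsets(alternatives)
    options = [nonempty_subsets(S) for S in agenda]
    for picks in product(*options):
        yield dict(zip(agenda, picks))

def chernoff(C):
    """S1 subset of S2 implies (S1 intersect C(S2)) subset of C(S1)."""
    return all((S1 & C[S2]) <= C[S1] for S1 in C for S2 in C if S1 <= S2)

def superset(C):
    """S1 subset of S2 implies C(S2) is not a proper subset of C(S1)."""
    return all(not (C[S2] < C[S1]) for S1 in C for S2 in C if S1 <= S2)

def path_independent(C):
    """C(S1 union S2) = C(C(S1) union C(S2)) for all agendas S1, S2."""
    return all(C[S1 | S2] == C[C[S1] | C[S2]] for S1 in C for S2 in C)

functions = list(all_choice_functions("xyz"))
mismatches = [C for C in functions
              if path_independent(C) != (chernoff(C) and superset(C))]
print(len(functions), len(mismatches))
```

An empty list of mismatches is what the theorem predicts; the search is a sanity check on a finite case and of course proves nothing for larger sets.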

What we now know is that quasi-transitive rationality is equivalent to 
the conjunction of the Chernoff condition, the superset property, and the 
generalized Condorcet property, and that if the generalized Condorcet 
property is no longer required we have path independence. Suppose we 
retain the Chernoff condition and the generalized Condorcet property but 
do not require that the superset property hold. Theorem 2 demonstrates 
that what remains is acyclicity. 

Theorem 2. If a choice function C on K satisfies the Chernoff condition 
and the generalized Condorcet property and if K contains all finite nonempty 
subsets of X, it has a unique, complete, reflexive, acyclic rationalization. 
If a choice function is induced by an acyclic relation, it satisfies the Chernoff 
condition and the generalized Condorcet property. 

Proof. Beginning with the first proposition, we assume C satisfies the 
conditions stated. Each two-element subset of X belongs to K, so the only 
possible rationalization is the base relation $R^*$:

    $x R^* y$ iff $x \in C(\{x, y\})$.

If $x \in C(S)$, then, by the Chernoff condition, $x \in C(\{x, y\})$ for each $y \in S$.
On the other hand, if $x \in C(\{x, y\})$ for all $y \in S$, then $x \in C(S)$ by the
generalized Condorcet property. Thus,

    $C(S) = \{x \in S : x \in C(\{x, y\}) \text{ for all } y \in S\}$

    $\quad\;\; = \{x \in S : x R^* y \text{ for all } y \in S\}$,

i.e., $R^*$ is in fact a rationalization of C. Completeness and reflexivity of $R^*$
are obvious; it remains to show that R* is acyclic. Suppose that 
$x_1 P^* x_2 P^* \cdots P^* x_n$, i.e., $x_i \notin C(\{x_{i-1}, x_i\})$ for $i = 2, \ldots, n$. By the Chernoff
condition, $x_i \notin C(\{x_1, x_2, \ldots, x_n\})$ for $i = 2, \ldots, n$. By our assumption about
the content of K, $C(\{x_1, x_2, \ldots, x_n\}) \neq \emptyset$, so $C(\{x_1, x_2, \ldots, x_n\}) = \{x_1\}$.
By another application of the Chernoff condition, $x_1 \in C(\{x_1, x_n\})$, that is,
not $x_n P^* x_1$, as was to be shown.

Turning now to the second assertion, for each $S \in K$,

    $C(S) = \{x \in S : x R y \text{ for all } y \in S\} = \{x \in S : \text{not } (\exists y)(y \in S \;\&\; y P x)\}$,

where R is acyclic. Suppose $x \in C(\{x, y\})$ for all $y \in S$. Then $x R y$ for all
$y \in S$, that is, $x \in C(S)$; the generalized Condorcet property therefore holds.
Suppose $x \in C(S_2)$ and $x \in S_1 \subset S_2$. Now $x \in C(S_2)$ implies $x R y$ for all
$y \in S_2$, which implies $x R y$ for all $y \in S_1$. Hence $x \in C(S_1)$, and the Chernoff
condition holds.
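Both directions of Theorem 2 can be checked together by brute force in a finite case. The sketch below is illustrative only and rests on our own assumptions: three alternatives, all nonempty subsets as agendas, and function names of our choosing. It tests whether the Chernoff condition plus the generalized Condorcet property holds exactly when the base relation rationalizes the choice function, and whether the base relation is then acyclic.

```python
from itertools import combinations, permutations, product

ALTS = ("x", "y", "z")

def nonempty_subsets(items):
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def all_choice_functions(alternatives):
    agenda = nonempty_subsets(alternatives)
    options = [nonempty_subsets(S) for S in agenda]
    for picks in product(*options):
        yield dict(zip(agenda, picks))

def chernoff(C):
    return all((S1 & C[S2]) <= C[S1] for S1 in C for S2 in C if S1 <= S2)

def base_maximal(C, S):
    """{x in S : x R* y for all y in S}, where x R* y iff x in C({x, y})."""
    return frozenset(x for x in S
                     if all(x in C[frozenset((x, y))] for y in S))

def generalized_condorcet(C):
    """If x is chosen against every y in S pairwise, x is chosen from S."""
    return all(base_maximal(C, S) <= C[S] for S in C)

def rationalized_by_base(C):
    return all(C[S] == base_maximal(C, S) for S in C)

def base_acyclic(C):
    def strict(a, b):
        return a != b and C[frozenset((a, b))] == frozenset((a,))
    for length in range(2, len(ALTS) + 1):
        for seq in permutations(ALTS, length):
            chain = all(strict(a, b) for a, b in zip(seq, seq[1:]))
            if chain and strict(seq[-1], seq[0]):
                return False
    return True

functions = list(all_choice_functions(ALTS))
rational = [C for C in functions if rationalized_by_base(C)]
mismatches = [C for C in functions
              if (chernoff(C) and generalized_condorcet(C))
              != rationalized_by_base(C)]
print(len(functions), len(rational), len(mismatches),
      all(base_acyclic(C) for C in rational))
```

On three alternatives there are 27 complete reflexive base relations, of which the two strict three-cycles are excluded, so 25 rational choice functions are expected.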

Theorems 1 and 2, coupled with Plott’s theorem, imply that the only 
path-independent choice functions which are acyclic rational are those 
which are quasi-transitive rational as well. We have earlier remarked that 
path-independent choice functions exist which are not rational (see 
Plott [17]). It should now be clear that there exist rational choice functions 
which violate path independence; indeed, the choice function induced by 
any acyclic but not quasi-transitive preference relation falls in this 
category. 

The following example shows that the Chernoff condition implies neither 
path independence nor the existence of a rationalization : 

Example. Let $X = \{x, y, z\}$ and $K = \mathcal{P}(X) - \{\emptyset\}$. The choice
function defined by $C(X) = \{x\}$ and $C(S) = S$ for all $S \subsetneq X$ is easily
shown to satisfy the Chernoff condition. It is not path-independent,
however, since $C(\{x, y, z\}) = \{x\} \subsetneq \{x, y\} = C(\{x, y\})$, which contradicts
the superset property. If C has any rationalization it must be universal
indifference, given $C(S) = S$ for all two-element S, but this contradicts
$C(X) = \{x\}$.
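The claims made about this example are easy to verify directly. The sketch below encodes the example's choice function as a Python dictionary (the encoding and variable names are ours) and evaluates the Chernoff condition, the superset property, and path independence, together with the set of base-relation maximal elements of X.

```python
from itertools import combinations

X = frozenset("xyz")
K = [frozenset(c) for r in range(1, 4) for c in combinations(sorted(X), r)]

# The example: C(X) = {x}, and C(S) = S for every proper subset S of X.
C = {S: (frozenset("x") if S == X else S) for S in K}

chernoff = all((S1 & C[S2]) <= C[S1] for S1 in K for S2 in K if S1 <= S2)
superset = all(not (C[S2] < C[S1]) for S1 in K for S2 in K if S1 <= S2)
path_independent = all(C[S1 | S2] == C[C[S1] | C[S2]] for S1 in K for S2 in K)

# Maximal elements of X under the base relation (universal indifference here).
base_maximal_on_X = frozenset(
    a for a in X if all(a in C[frozenset((a, b))] for b in X))

print(chernoff, superset, path_independent, sorted(base_maximal_on_X))
```

The base relation selects all of X, whereas the example sets the choice from X to {x} alone, which is why no rationalization exists.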

Finally, we relate path independence and the Chernoff condition to the 
properties of the base relation. 

Lemma 1. Path independence implies base quasi-transitivity. The 
Chernoff condition implies base acyclicity. 

Proof. Suppose that C is path-independent and that $x P^* y P^* z$
for some $x, y, z \in X$. By path independence, $\{x\} = C(\{x, y\}) =
C[C(\{x\}) \cup C(\{y, z\})] = C(\{x, y, z\}) = C[C(\{x, y\}) \cup C(\{z\})] = C(\{x, z\})$.
Hence $x P^* z$, so $R^*$ is quasi-transitive.

The second proposition is established by an argument already given 
in the first part of the proof of Theorem 2. 





Counterexamples to the converses of Lemma 1's assertions are left for
the reader to construct. 
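For readers who prefer to see such counterexamples produced mechanically, an exhaustive search over a three-element universe suffices. The following sketch is our own illustration, with assumed function names and with all nonempty subsets taken as agendas. It confirms both assertions of Lemma 1 in this finite case and collects choice functions witnessing the failure of each converse.

```python
from itertools import combinations, permutations, product

ALTS = ("x", "y", "z")

def nonempty_subsets(items):
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def all_choice_functions(alternatives):
    agenda = nonempty_subsets(alternatives)
    options = [nonempty_subsets(S) for S in agenda]
    for picks in product(*options):
        yield dict(zip(agenda, picks))

def chernoff(C):
    return all((S1 & C[S2]) <= C[S1] for S1 in C for S2 in C if S1 <= S2)

def path_independent(C):
    return all(C[S1 | S2] == C[C[S1] | C[S2]] for S1 in C for S2 in C)

def strict(C, a, b):
    """a P* b: a is the unique choice from {a, b}."""
    return a != b and C[frozenset((a, b))] == frozenset((a,))

def base_quasi_transitive(C):
    return all(strict(C, a, c) for a, b, c in permutations(ALTS, 3)
               if strict(C, a, b) and strict(C, b, c))

def base_acyclic(C):
    for length in range(2, len(ALTS) + 1):
        for seq in permutations(ALTS, length):
            chain = all(strict(C, a, b) for a, b in zip(seq, seq[1:]))
            if chain and strict(C, seq[-1], seq[0]):
                return False
    return True

functions = list(all_choice_functions(ALTS))
lemma_violations = [C for C in functions
                    if (path_independent(C) and not base_quasi_transitive(C))
                    or (chernoff(C) and not base_acyclic(C))]
bqt_not_pi = [C for C in functions
              if base_quasi_transitive(C) and not path_independent(C)]
ba_not_chernoff = [C for C in functions
                   if base_acyclic(C) and not chernoff(C)]
print(len(lemma_violations), len(bqt_not_pi) > 0, len(ba_not_chernoff) > 0)
```

Any element of `bqt_not_pi` or `ba_not_chernoff` is a counterexample to the corresponding converse; the example of Section 2 is one member of the first list.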

We conclude this section with an implication diagram which summarizes 
the results presented here and other relationships which follow easily 
from the definitions. The properties in parentheses are jointly equivalent 
to the conditions above them. Note that the equivalence of rationality 
and acyclic rationality is dependent on our assumption that K includes all 
finite nonempty subsets of X. 

    TR  ----->   QTR   ----->   AR   <----->  R

  (C, β)      (C, S, GC)     (C, GC)
                  |              |
                  v              v
                  PI  -------->  C

                (C, S)
                  |              |
                  v              v
                 BQT  -------->  BA  ----->  BTA


3. Impossibility Theorems 

Suppose that there are n individuals and let $N = \{1, 2, \ldots, n\}$; it is
assumed that $2 \le n < \infty$. X stands for the set of social alternatives,
now taken to have at least three elements. The problem at hand is to
characterize the institutionally and ethically “acceptable” collective
choice rules; such a rule F is a function which maps each profile, or
n-tuple of individual transitive preference orderings of X, into the set of
choice functions on K. (Note that what is called here a collective choice rule
is analogous to Arrow’s [2] social welfare function, and that choice
functions play in this analysis the same role as Arrow’s social preference
ordering.) The domain of F is the set of all logically possible profiles.
Formally, given a profile k, society’s choice function C is given by
$C = F(k)$. However, the function F will be fixed throughout the proof of
each of the subsequent theorems. We will therefore frequently not refer
explicitly to F, but rather will simply write $C^k(S)$ for the choice out of
agenda set S under profile k, given the fixed collective choice rule. When
the profile is invariant, we will suppress the superscript as well and merely
write $C(S)$.

All of the rationality and consistency requirements studied in Section 2
are properties of choice functions, that is, of elements of the range of 
collective choice rules. Each of these conditions will also be attributed 





to a collective choice rule F in the event that, for every profile, the choice 
function determined by F satisfies the given condition. For example, 
we will call a collective choice rule path-independent if the image of every 
profile under the rule is a path-independent choice function. 

A further set of definitions is necessary before proceeding to the results 
of this section. 

A set of individuals $J \subset N$ is decisive for x against y (weakly decisive for x
against y) iff $x P_i y$ for all $i \in J$ and $y P_i x$ for all $i \notin J$ implies $\{x\} = C(\{x, y\})$
($x \in C(\{x, y\})$).

If V is decisive for some a against some b, and W being decisive for some 
x against some y implies that the number of individuals in W is at least 
as great as the number of individuals in V, then V is a smallest decisive set. 

Individual i is a dictator (weak dictator) iff for all $x, y \in X$, $x P_i y$ implies
$\{x\} = C(\{x, y\})$ ($x \in C(\{x, y\})$).

A collective choice rule is said to satisfy: 

The Pareto condition iff for any profile k such that $x P_i y$ for all $i \in N$,
we have $\{x\} = C^k(\{x, y\})$.

Nondictatorship iff there exists no dictator. 

Non-weak-dictatorship iff there exists no weak dictator. 

Positive responsiveness iff, whenever k is a profile resulting in $x \in C^k(\{x, y\})$ and l
is another profile with $R_j^k = R_j^l$ for all $j \neq i$ and either ($y P_i^k x \;\&\; x I_i^l y$) or
($x I_i^k y \;\&\; x P_i^l y$), we have $\{x\} = C^l(\{x, y\})$, where $i \in N$ is any specified
individual.

Independence of irrelevant alternatives iff for any two profiles
$j = (R_1, \ldots, R_n)$ and $k = (R_1', \ldots, R_n')$ such that ($x R_i y \Leftrightarrow x R_i' y$ \&
$y R_i x \Leftrightarrow y R_i' x$) for all $i \in N$, we have $C^j(\{x, y\}) = C^k(\{x, y\})$.

Notice that our independence condition restricts its attention to choices 
from two-element sets, in contrast with Arrow’s independence axiom. 
Although the two conditions are equivalent for rational collective choice 
rules, our axiom is strictly weaker if rationality is not imposed. 

Profiles will be written horizontally with more preferred alternatives 
to the left; indifference will be indicated by parentheses. For example, 
the expression 

V: x, (y, z) 

means that every individual in V prefers x to both y and z, between which 
indifference prevails. 

We now proceed to establish an impossibility theorem using only path 
independence rather than Arrow’s transitive rationality. The theorem is 





otherwise Arrovian except for a slight strengthening of the nondictatorship 
condition. 

Theorem 3. If there are at least three voters, there is no collective 
choice rule satisfying all of: 

(1) path independence, 

(2) the Pareto condition, 

(3) independence of irrelevant alternatives, and 

(4) non-weak-dictatorship. 

In view of Lemma 1, we can establish this proposition by proving the 
following stronger result, which utilizes the weaker condition of base 
quasi-transitivity. An alternative proof of Theorem 4 is given by
Fishburn [9, Theorem 16.2].

Theorem 4. If there are at least three voters, there is no collective 
choice rule satisfying all of : 

(1) base quasi-transitivity, 

(2) the Pareto condition, 

(3) independence of irrelevant alternatives, and 

(4) non-weak-dictatorship. 

This proposition follows immediately from the three lemmas below. 

Lemma 2. If a collective choice rule satisfies the Pareto condition,
independence of irrelevant alternatives, and base quasi-transitivity, and if i
is decisive for some $x, y \in X$, then i is a dictator; that is,

    $(x P_i y \;\&\; y P_j x \text{ for all } j \neq i \Rightarrow \{x\} = C(\{x, y\}))$

        $\Rightarrow (\text{For all } s, t \in X: s P_i t \Rightarrow \{s\} = C(\{s, t\}))$.

Proof. We first show

    $(x P_i y \;\&\; y P_j x \text{ for all } j \neq i \Rightarrow \{x\} = C(\{x, y\}))$

        $\Rightarrow (\text{For all } s \in X: s P_i y \Rightarrow \{s\} = C(\{s, y\}))$.    (1)

Suppose not. Then there exists a profile such that $x P_i y$ and $y P_j x$ for all
$j \neq i$ implies $\{x\} = C(\{x, y\})$, $s P_i y$, and $\{s\} \neq C(\{s, y\})$:

    i: s, x, y,

    $N - \{i\}$: (some (n − 1)-tuple of orderings of y and s), x.

By assumption, $\{x\} = C(\{x, y\})$. By the Pareto condition, $\{s\} = C(\{s, x\})$.
Base quasi-transitivity then implies that, under this profile, $\{s\} = C(\{s, y\})$.
By independence, $\{s\} = C(\{s, y\})$ under every profile in which $s P_i y$, since
no specification has been made of other voters’ preferences between these
alternatives. This contradiction establishes (1). Next we show

    $(\text{For all } s \in X: s P_i y \;\&\; y P_j s \text{ for all } j \neq i \Rightarrow \{s\} = C(\{s, y\}))$

        $\Rightarrow (\text{For all } s, t \in X: s P_i t \Rightarrow \{s\} = C(\{s, t\}))$.    (2)

The antecedent of (2) is implied by (1). Suppose (2) is false. Then there
exists a profile such that $s P_i y$ and $y P_j s$ for all $j \neq i \Rightarrow \{s\} = C(\{s, y\})$,
$s P_i t$, and $\{s\} \neq C(\{s, t\})$:

    i: s, y, t,

    $N - \{i\}$: y, (some (n − 1)-tuple of orderings of s and t).

By assumption, $\{s\} = C(\{s, y\})$ and by the Pareto condition, $\{y\} =
C(\{y, t\})$. Base quasi-transitivity shows that under this profile $\{s\} =
C(\{s, t\})$. By independence, the social choice is the same for all profiles in
which $s P_i t$ holds, contradicting our assumption and establishing (2).

Lemma 3. If a collective choice rule satisfies independence of irrelevant 
alternatives, the Pareto condition, and base quasi-transitivity, and if i is 
weakly decisive for some $x, y \in X$, then i is a weak dictator; that is,

    $(x P_i y \;\&\; y P_j x \text{ for all } j \neq i \Rightarrow x \in C(\{x, y\}))$

        $\Rightarrow (\text{For all } s, t \in X: s P_i t \Rightarrow s \in C(\{s, t\}))$.

The proof of this lemma is virtually identical to that of Lemma 2 and is 
omitted. 

Lemma 4. If V is a smallest decisive set with respect to a and b under a 
collective choice rule satisfying base quasi-transitivity, the Pareto condition, 
nondictatorship, and independence of irrelevant alternatives, then 

V contains at least two individuals, (3) 

and every i e V is a weak dictator. (4) 

Proof. Assertion (3) is obvious from Lemma 2 and nondictatorship.
To establish (4), we must show:

    If $i \in V$, $x P_i y \Rightarrow x \in C(\{x, y\})$ for some $x, y \in X$.

Suppose not. Then for some $a, z \in X$, there exists a profile of the form:

    i: a, z,

    $N - \{i\}$: (some (n − 1)-tuple of orderings of a and z)

such that $\{z\} = C(\{a, z\})$. Let $W \subset V$ and $V - W = \{i\}$. Consider the
following further specification of the previous profile:

    i: a, b, z,

    W: (same individual orderings as before between a and z), b,

    $N - V$: b, (same individual orderings as before between a and z).

We know $C(\{a, z\}) = \{z\}$ and, because of V's decisiveness for a over b,
$C(\{a, b\}) = \{a\}$. By base quasi-transitivity, $\{z\} = C(\{b, z\})$ under the
profile in question. But this implies that W is decisive for z against b,
which contradicts the minimality of V. Thus, if $i \in V$, i is weakly decisive
for some x against y, $x, y \in X$. By Lemma 3, such an individual is a weak
dictator, establishing (4).

Proof of Theorem 4. Every dictator is a weak dictator, so prohibiting 
weak dictators rules out dictators too. If a collective choice rule satisfies 
base quasi-transitivity, independence of irrelevant alternatives, the Pareto 
condition, and has no weak dictators (and thus no dictator either), then 
Lemma 4 yields the conclusion that there must exist a weak dictator, 
which contradiction proves the theorem. 
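A concrete rule helps to show why weak dictators cannot be avoided under the first three conditions. The sketch below is an illustration of our own and does not appear in the original argument: it assumes three voters, three alternatives, and a unanimity rule on pairs under which society strictly prefers a to b only when every voter does. That rule satisfies the Pareto condition and independence by construction; the code checks base quasi-transitivity over all profiles and identifies the dictators and weak dictators.

```python
from itertools import permutations, product

ALTS = ("x", "y", "z")
N_VOTERS = 3

def weak_orderings(alts):
    """All weak orderings of `alts`, as dicts alt -> rank (lower is better)."""
    canonical = set()
    for ranks in product(range(len(alts)), repeat=len(alts)):
        levels = sorted(set(ranks))
        # Compress ranks so each ordering has exactly one encoding.
        canonical.add(tuple(levels.index(r) for r in ranks))
    return [dict(zip(alts, ranks)) for ranks in sorted(canonical)]

def pair_choice(profile, a, b):
    """Hypothetical unanimity rule: strict social preference needs unanimity."""
    if all(o[a] < o[b] for o in profile):
        return frozenset((a,))
    if all(o[b] < o[a] for o in profile):
        return frozenset((b,))
    return frozenset((a, b))

ORDERINGS = weak_orderings(ALTS)
PROFILES = list(product(ORDERINGS, repeat=N_VOTERS))

def base_quasi_transitive(profile):
    return all(pair_choice(profile, a, c) == frozenset((a,))
               for a, b, c in permutations(ALTS, 3)
               if pair_choice(profile, a, b) == frozenset((a,))
               and pair_choice(profile, b, c) == frozenset((b,)))

def is_weak_dictator(i):
    return all(a in pair_choice(p, a, b)
               for p in PROFILES for a, b in permutations(ALTS, 2)
               if p[i][a] < p[i][b])

def is_dictator(i):
    return all(pair_choice(p, a, b) == frozenset((a,))
               for p in PROFILES for a, b in permutations(ALTS, 2)
               if p[i][a] < p[i][b])

bqt_holds = all(base_quasi_transitive(p) for p in PROFILES)
weak_dictators = [i for i in range(N_VOTERS) if is_weak_dictator(i)]
dictators = [i for i in range(N_VOTERS) if is_dictator(i)]
print(len(ORDERINGS), bqt_holds, weak_dictators, dictators)
```

Under this assumed rule no voter is a dictator, yet every voter is a weak dictator, which is consistent with Theorem 4: the first three conditions can be met only by violating non-weak-dictatorship.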

Mas-Colell and Sonnenschein’s [14] result on the inconsistency of 
quasi-transitivity and positive responsiveness in the presence of the 
Arrovian conditions carries over in a straightforward manner to the 
case of irrational path independence, and thence to base quasi-transitivity: 

Theorem 5. If there are at least three voters, there is no collective choice 
rule satisfying all of: 

(1) base quasi-transitivity, 

(2) the Pareto condition, 

(3) nondictatorship, 

(4) independence of irrelevant alternatives, and 

(5) positive responsiveness. 

Proof. By Lemma 4, there exist at least two weak dictators; call them
1, 2. Suppose under some profile $x P_1 y$ and $y P_2 x$ for some $x, y \in X$; by
weak dictatorship, $\{x, y\} = C(\{x, y\})$, regardless of the preferences of
other voters, of whom there exists at least one. This violates positive
responsiveness.
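The conflict between two weak dictators and positive responsiveness can be seen in a two-profile example. The sketch below is illustrative and uses a hypothetical unanimity rule of our own (society strictly prefers a to b only if every voter does), under which every voter is a weak dictator. Voters 1 and 2 disagree over x and y, and the third voter moves from preferring y to indifference.

```python
def pair_choice(profile, a, b):
    """Hypothetical unanimity rule; each ordering is a dict alt -> rank."""
    if all(o[a] < o[b] for o in profile):
        return frozenset((a,))
    if all(o[b] < o[a] for o in profile):
        return frozenset((b,))
    return frozenset((a, b))

# Profile k: voter 1 prefers x, voters 2 and 3 prefer y (lower rank is better).
profile_k = [{"x": 0, "y": 1}, {"x": 1, "y": 0}, {"x": 1, "y": 0}]
# Profile l: only voter 3 changes, from y P x to x I y.
profile_l = [{"x": 0, "y": 1}, {"x": 1, "y": 0}, {"x": 0, "y": 0}]

choice_k = pair_choice(profile_k, "x", "y")
choice_l = pair_choice(profile_l, "x", "y")

# x is chosen under k and voter 3 has moved toward x, so positive
# responsiveness would require the choice under l to be exactly {x}.
required_by_pr = frozenset("x")
print(sorted(choice_k), sorted(choice_l), choice_l == required_by_pr)
```

Because voters 1 and 2 each keep their preferred alternative in the choice set, the choice under both profiles is {x, y}, and the change in voter 3's ordering cannot break the tie.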

In view of the impossibility theorems discussed in the Introduction and 
the new results just presented, one might be tempted to retreat and demand 
the imposition only of the Chernoff condition which, as we have seen, is 
strictly weaker than both acyclicity and path independence. This condition 
is an appealing one to impose on collective choice rules. It is clearly 
desirable in piecemeal choice mechanisms where choices are made from 
unions of choices over subsets. If an alternative fails to be chosen in some 
subset, it need not be considered again at a later stage, for the
contrapositive of the Chernoff condition ensures that the alternative will not be
among the final choices. Arrow’s justification in [2, pp. 26-27] for his
independence axiom is obviously instead an argument for this condition.
See also Sen [23, p. 17]. Nevertheless, as the following theorems
demonstrate, the Chernoff condition standing alone as a rationality condition
must also be rejected, at least if the Arrovian conditions are found
compelling. 4

Theorem 6. If there are at least four voters, there is no collective choice 
rule satisfying all of : 

(1) the Chernoff condition, 

(2) the Pareto condition, 

(3) non-weak-dictatorship, 

(4) independence of irrelevant alternatives, and 

(5) positive responsiveness. 

As in the case of Theorem 3, we will establish this result by proving an 
even stronger proposition. By Lemma 1, the Chernoff condition implies 
base acyclicity, and it is obvious from the definitions that base acyclicity 
implies base triple acyclicity. 

Theorem 7. If there are at least four voters, there is no collective choice
rule satisfying all of:

(1) base triple acyclicity,

(2) the Pareto condition,

(3) non-weak-dictatorship,

(4) independence of irrelevant alternatives, and

(5) positive responsiveness.

4 The existence of collective choice rules which satisfy the Chernoff condition but are
neither path-independent nor rational is guaranteed by the following proposition,
which is easily verified: the rule which makes the collective choice equal to the union of
individuals' choices from the feasible set satisfies the Chernoff condition, if the
individuals' preferences satisfy that condition as well. The group’s choice function is
precisely the one given in the example in Section 2 if the group has two members with
the following acyclic preferences: $x P_1 y$, $y P_1 z$, $x I_1 z$; $x P_2 z$, $z P_2 y$, $x I_2 y$, and the collective
choice rule is the union rule just described.

Proof. We will proceed in two steps. First, assuming all of the 
conditions in the theorem except non-weak-dictatorship, we will show 
that there exists a voter who is weakly decisive for some pair of alternatives. 
We will then show that individual is a weak dictator, contradicting the 
third condition in the theorem. 

Step 1. We must show that if a collective choice rule satisfies base
triple acyclicity, the Pareto condition, independence, and positive
responsiveness, there exists an individual $i \in N$ and alternatives $x, y \in X$ such that:

    $(x P_i y \;\&\; y P_j x \text{ for all } j \neq i) \Rightarrow x \in C(\{x, y\})$.    (5)

Suppose (5) is false. Then for all $x, y \in X$ and all $i \in N$, if a profile a is such
that under it $x P_i y$ and $y P_j x$ for all $j \in N - \{i\}$, then $C^a(\{x, y\}) = \{y\}$.
Let V be a smallest decisive set; V is decisive for some x against y. Our
assumption implies that V contains at least two individuals, say 1 and 2.
Partition V as $V = \{1, 2\} \cup V^*$. Consider profile b:

    1: x, y, z

    $\{2\} \cup V^*$: z, x, y

    $N - V$: y, z, x.

By the definition of V, $\{x\} = C^b(\{x, y\})$, and by assumption, $C^b(\{x, z\}) =
\{z\}$. By base triple acyclicity, $z \in C^b(\{y, z\})$. Now consider profile c:

    1: x, y

    2: y, x

    $V^*$: x, y

    $N - V$: y, x.

Since V is a smallest decisive set, $y \in C^c(\{x, y\})$. Next consider profile d:

    1: (x, y, z)

    2: z, y, x

    $V^*$: x, z, y

    $N - V$: y, x, z.

Comparing profiles c and d, and noting the conclusion drawn from the
former, positive responsiveness and independence require that $\{y\} =
C^d(\{x, y\})$. Comparing profiles b and d, the same two axioms require that
$\{z\} = C^d(\{y, z\})$. Base triple acyclicity then yields $z \in C^d(\{x, z\})$. Next
examine profile e:

    1: z, x

    2: z, x

    $V^*$: x, z

    $N - V$: x, z.

Comparing profiles d and e, positive responsiveness and independence
require that $\{z\} = C^e(\{x, z\})$. This conclusion and independence imply
that {1, 2} is decisive for z against x, so that $V = \{1, 2\}$. Finally examine
profile f:

    1: x, y, z

    2: z, x, y

    $N - V$: y, z, x.

Since V is decisive for x against y, $\{x\} = C^f(\{x, y\})$, while our assumption
yields $\{y\} = C^f(\{y, z\})$. By base triple acyclicity, $x \in C^f(\{x, z\})$, in
contradiction to our assumption. Thus we have shown that voter 1 is weakly
decisive for x against z.

Step 2. In this step we show that if voter 1 is weakly decisive for x
against y, then he or she is a weak dictator, if there are at least four
voters. In the presence of positive responsiveness, this can be established
by proving that for all $s, t \in X$,

    $(s P_1 t \;\&\; t P_j s \text{ for all } j \in N - \{1\}) \Rightarrow s \in C(\{s, t\})$.    (6)

We will prove only that for all $t \in X$,

    $(x P_1 t \;\&\; t P_j x \text{ for all } j \in N - \{1\}) \Rightarrow x \in C(\{x, t\})$.    (7)

The steps from (7) to (6) are sufficiently similar to the ones we use in
establishing (7) that they may safely be skipped. To prove (7), we first
examine profile a:

    1: x, y, t

    2: (x, y), t

    3: y, t, x

    4: y, t, x

    $N - \{1, 2, 3, 4\}$: y, t, x.

By Step 1 and positive responsiveness, $C^a(\{x, y\}) = \{x\}$. By the Pareto
condition, $C^a(\{y, t\}) = \{y\}$. By base triple acyclicity, $x \in C^a(\{x, t\})$. Now
consider profile b:

    1: y, x, t

    2: y, x, t

    3: t, y, x

    4: y, (x, t)

    $N - \{1, 2, 3, 4\}$: t, y, x.

Comparing profiles a and b, positive responsiveness and independence
require that $C^b(\{x, t\}) = \{x\}$. By the Pareto condition, $C^b(\{x, y\}) = \{y\}$.
Base triple acyclicity then implies $y \in C^b(\{y, t\})$. Next examine profile c:

    1: x, y, t

    2: y, t, x

    3: (x, y, t)

    4: y, t, x

    $N - \{1, 2, 3, 4\}$: t, y, x.

By Step 1 and positive responsiveness, $C^c(\{x, y\}) = \{x\}$. Comparing
profiles b and c, positive responsiveness and independence require that
$C^c(\{y, t\}) = \{y\}$. Another application of base triple acyclicity yields
$x \in C^c(\{x, t\})$. Next consider profile d:

    1: y, x, t

    2: t, y, x

    3: y, x, t

    4: t, y, x

    $N - \{1, 2, 3, 4\}$: t, y, x.

Comparing profiles c and d, positive responsiveness and independence
require $C^d(\{x, t\}) = \{x\}$; by the Pareto condition $\{y\} = C^d(\{x, y\})$. Base
triple acyclicity then yields $y \in C^d(\{y, t\})$. Finally consider profile e:

    1: x, y, t

    2: t, (x, y)

    3: y, t, x

    4: (y, t), x

    $N - \{1, 2, 3, 4\}$: t, y, x.

Comparing profiles d and e, positive responsiveness and independence
again require $\{y\} = C^e(\{y, t\})$. By Step 1 and positive responsiveness,
$C^e(\{x, y\}) = \{x\}$. A final application of base triple acyclicity yields
$x \in C^e(\{x, t\})$. In view of the independence axiom, this proves (7).


4. Concluding Remarks

Arrow and subsequent writers have modeled the output of collective 
decision-making institutions as binary social preference relations, both 
by analogy with consumers’ preferences in demand theory and as a 
generalization of Condorcet’s proposal that any alternative which received 
a majority of votes against every other candidate should be chosen. 
Such a view, as the well-known series of impossibility theorems
demonstrates, is inconsistent with a set of several democratic requirements.
In this paper we have shown that binary rationality per se is not the 
culprit in these theorems. 

We have taken a more general view, and required only that the group 
make a nonempty choice from every finite feasible set of alternatives. 
Several weak conditions imposed on the resultant choice functions are 
each shown to contradict one or more of the same democratic
requirements, even if the choices have no binary rationalization.


References 

1. K. J. Arrow, Rational choice functions and orderings, Economica N.S. 26 (1959),
121-127.

2. K. J. Arrow, “Social Choice and Individual Values,” Wiley, New York, 1963. 

3. D. H. Blair, Possibility theorems for non-binary social choice functions,
unpublished manuscript.

4. D. H. Blair, Path-independent social choice functions: A further result,
Econometrica 43 (1975), 173-174.

5. G. Bordes, Alpha-rationality and social choice: A general possibility theorem, 
unpublished manuscript. 

6. D. J. Brown, Acyclic aggregation over a finite set of alternatives, forthcoming in 
the Rev. Econ. Studies. 

7. D. J. Brown, Aggregation of preferences, Quart. J. Econ. 89 (1975), 456-469. 

8. H. Chernoff, Rational selection of decision functions, Econometrica 22 (1954), 
423-443. 

9. P. C. Fishburn, “The Theory of Social Choice,” Princeton Univ. Press, Princeton, 
N.J., 1973. 

10. A. Gibbard, Social choice and the Arrow conditions, unpublished manuscript.

11. B. Hansson, Choice structures and preference relations, Synthese 18 (1968),
443-458.

12. H. Herzberger, Ordinal preference and rational choice, Econometrica 41 (1973),
187-237.

13. J. S. Kelly, Two impossibility theorems on independence of path, unpublished 
manuscript. 

14. A. Mas-Colell and H. Sonnenschein, General possibility theorems for group
decision functions, Rev. Econ. Studies 39 (1972), 185-192.

15. R. P. Parks, Choice paths and rational choice, unpublished manuscript. 

16. C. R. Plott, Ethics, social choice theory, and the theory of economic policy, 
J. Mathematical Sociology 2 (1972), 181-208. 

17. C. R. Plott, Path independence, rationality, and social choice, Econometrica 41 
(1973), 1075-1091. 

18. C. R. Plott, Rationality and relevance in social choice theory, forthcoming in the 
Amer. Econ. Rev. 

19. M. K. Richter, Revealed preference theory, Econometrica 34 (1966), 635-645. 

20. M. K. Richter, Rational choice, in “Preferences, Utility, and Demand” (J. S.
Chipman, L. Hurwicz, M. K. Richter, and H. Sonnenschein, Eds.), pp. 29-58,
Harcourt Brace Jovanovich, New York, 1971.

21. T. Schwartz, On the possibility of rational policy evaluation, Theory and Decision 
1 (1970), 89-106. 

22. A. K. Sen, Quasi-transitivity, rational choice, and collective decisions, Rev. Econ.
Studies 36 (1969), 381-393.

23. A. K. Sen, “Collective Choice and Social Welfare,” Holden-Day, San Francisco, 
1970. 

24. A. K. Sen, Choice functions and revealed preference, Rev. Econ. Studies 38 (1971), 
307-317. 

25. K. Suzumura, Rational choice and revealed preference, forthcoming in the Rev. 
Econ. Studies. 

26. K. Suzumura, General possibility theorems for path-independent social choice, 
Kyoto Institute of Economic Research, Discussion Paper No. 077, 1974. 



JOURNAL OF ECONOMIC THEORY 13, 380-395 (1976) 


Issues in Optimal Educational Policy 
in the Context of Balanced Growth* 

R. Manning 

School of Economics, University of New South Wales, 
Kensington, N.S.W. 2033, Australia

Received October 21, 1974


Introduction 

This paper introduces greater realism into the simple aggregative model 
of the formation and maintenance of a skilled work force that was 
developed in [4]. Detailed consideration was given to the mathematical
properties of development into steady growth in [4], but attention is here 
restricted to balanced growth, which avoids what may be transitory phases 
in educational policy. Even with this limitation, the analysis of complex 
models yields insights into the nature of optimal educational policies. 

For an interesting history of classical economic thought about education, 
Tu [11] should be consulted. More recently, human capital theory has 
developed an individualistic approach to the acquisition of skills (see, for 
example, [2, 7]): aggregative analyses of economic aspects of educational 
policy have been performed by Bottomley [3], Razin [6], Tu [10, 12], and 
Uzawa [13], while some justifications for the use of this approach are given 
in [4].


Basic Theoretical Model 

An explanation of optimal balanced growth in the basic model is now 
given: this supplements the formal mathematical development of [4] and 
is the basis of the generalizations made here. 

It is supposed that there is one consumption good produced by inputs 
of skilled and unskilled labor according to a quasi-concave, homogeneous 
production function of degree 1; that is, 

C = f(X, Y), (1) 

* I am very grateful for the perceptive comments of the referees. 

Copyright © 1976 by Academic Press, Inc.

All rights of reproduction in any form reserved. 



EDUCATIONAL POLICY 


381 


where consumption good output is C, f is the production function, and X 
and Y are the inputs of skilled, and unskilled labor services, respectively. 1 

It is implicitly assumed that skilled workers retain the same ability to
perform unskilled tasks as unskilled workers do, but acquire a capacity
to perform skilled tasks with a higher productivity than their unskilled
colleagues. Therefore, no unskilled worker will be put to a skilled job if
there are unemployed skilled workers. Furthermore, the productivity of
an unskilled worker is always higher at unskilled than at skilled tasks.
This ensures that unskilled workers are never employed in skilled
occupations. There is also a presumption that the marginal productivity of
skilled workers in skilled occupations exceeds that of unskilled workers in
unskilled jobs, and so a presumption that skilled workers should not be
used for unskilled tasks. Therefore, X and Y will be regarded as the number
of skilled and unskilled workers employed in producing the consumption
good, except in the later discussion of education as consumption.

X and Y are those parts of the skilled and unskilled workforce not
engaged in the education sector as teacher or pupil. The total workforce is

    $L = S + U$,    (2)

where S and U are the total number of skilled and unskilled workers,
respectively. In balanced growth, there must be enough education to offset
net population growth, which increases the number of unskilled workers,
and mortality among the skilled workers. If n is the natural rate of net
population growth, and m is the mortality rate among skilled workers, the
per capita 2 production of newly skilled workers must be $(n + m)S/L$. At
any time, E educators can teach $Es$ students, of whom $Es\xi$ graduate. The
graduation rate $\xi$ reflects the length of training, as well as the rate of
failure, of students. It follows from the supply and demand for graduates
that a proportion $E/S = (n + m)/(s\xi)$ of the skilled workforce is engaged
in education in balanced growth. If there are $\alpha = S/L$ skilled workers
per capita, then there are $(1 - (n + m)/(s\xi))\alpha$ skilled workers per capita
available for the production of the consumption good in balanced growth.
Using the student-staff ratio s, it can be seen that in balanced growth the
number of unskilled workers per capita available for consumption good
production is $1 - (1 + (n + m)/\xi)\alpha$.
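The manpower accounting of this paragraph can be traced numerically. The sketch below is an illustration under assumed parameter values that are not taken from the paper; `xi` stands for the graduation rate, `s` for the student-staff ratio, and the function name is our own.

```python
import math

def balanced_growth_shares(alpha, n, m, s, xi):
    """Per capita allocation of the workforce in balanced growth.

    alpha: skilled workers per capita (S/L)
    n, m:  net population growth rate and skilled mortality rate
    s:     student-staff ratio
    xi:    graduation rate
    """
    graduates_needed = (n + m) * alpha      # newly skilled workers per capita
    students = graduates_needed / xi        # E*s/L, since E*s*xi graduate
    teachers = students / s                 # E/L
    skilled_in_production = alpha - teachers
    unskilled_in_production = 1 - alpha - students
    return teachers, students, skilled_in_production, unskilled_in_production

# Hypothetical parameter values, chosen only for illustration.
alpha, n, m, s, xi = 0.3, 0.02, 0.03, 15.0, 0.25
teachers, students, x_share, y_share = balanced_growth_shares(alpha, n, m, s, xi)

# The closed forms stated in the text.
x_formula = (1 - (n + m) / (s * xi)) * alpha
y_formula = 1 - (1 + (n + m) / xi) * alpha
print(teachers, students, x_share, y_share)
```

Subtracting teachers from the skilled and students from the unskilled reproduces the two closed-form expressions in the text, which is the only point of the exercise.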

1 The notation adopted here is similar to that of the earlier paper [4].

2 “Per capita” will be used here to denote comparisons with the total workforce.
The total population is assumed to be a larger number than the workforce by a
time-independent fraction. In view of this, the criteria adopted below are not restrictive.
However, the class of policies does not admit variations in the age of admission to the
workforce. The theory must be interpreted as concerning job choice for people with a
basic schooling completed. See [4].



382 


R. MANNING 


The per capita output of the consumption good in balanced growth is

    $c = C/L = (1/L) f(X, Y) = f(X/L, Y/L)$

    $\quad = f\big((1 - (n + m)/(s\xi))\alpha,\; 1 - (1 + (n + m)/\xi)\alpha\big)$.    (3)

For an interior solution to the problem of maximizing sustainable per
capita consumption, it is necessary and sufficient that the per capita level
of skill satisfy

    $\dfrac{\partial f(\cdot)/\partial X}{\partial f(\cdot)/\partial Y} = \dfrac{1 + (n + m)/\xi}{1 - (n + m)/(s\xi)}$.    (4)
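The step from (3) to (4) is a one-line differentiation, spelled out here for convenience. The following is a sketch in our own notation, writing $\xi$ for the graduation rate and $\alpha$ for the per capita level of skill, and assuming f is differentiable.

```latex
\begin{align*}
c(\alpha) &= f\Big(\big(1 - \tfrac{n+m}{s\xi}\big)\alpha,\;
                  1 - \big(1 + \tfrac{n+m}{\xi}\big)\alpha\Big), \\
\frac{dc}{d\alpha} &= \frac{\partial f}{\partial X}\Big(1 - \frac{n+m}{s\xi}\Big)
                    - \frac{\partial f}{\partial Y}\Big(1 + \frac{n+m}{\xi}\Big) = 0
\quad\Longrightarrow\quad
\frac{\partial f/\partial X}{\partial f/\partial Y}
   = \frac{1 + (n+m)/\xi}{1 - (n+m)/(s\xi)} .
\end{align*}
```

Sufficiency follows from the quasi-concavity assumed for f, since the two arguments of f are linear in $\alpha$.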

An “educational golden rule” of a different kind to that of Phelps [5] is 
thus defined. Under perfect competition, factor payments are proportional 
to marginal productivity, so that (4) defines the ratio of skilled to unskilled 
wage rates. Now s is likely to be large (say, in excess of 10), so the optimal 
differential for skill is approximately $(n + m)/\xi$: for realistic values of
parameters this skill differential is remarkably small—a range of 10 per cent 
to 60 per cent is suggested in [4] as reasonable. 
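A numerical illustration may make the size of the differential concrete. The sketch below rests on assumptions that are not in the paper: the parameter values are hypothetical, and a Cobb-Douglas production function $f(X, Y) = X^a Y^{1-a}$ is used purely as a convenient example of a quasi-concave function homogeneous of degree 1. It computes the wage ratio implied by (4) and locates the consumption-maximizing level of skill by grid search.

```python
def skill_wage_ratio(n, m, s, xi):
    """Right-hand side of (4): ratio of skilled to unskilled marginal products."""
    return (1 + (n + m) / xi) / (1 - (n + m) / (s * xi))

def per_capita_output(alpha, n, m, s, xi, a):
    """Equation (3) under an assumed Cobb-Douglas technology X**a * Y**(1 - a)."""
    skilled = (1 - (n + m) / (s * xi)) * alpha
    unskilled = 1 - (1 + (n + m) / xi) * alpha
    return skilled ** a * unskilled ** (1 - a)

# Hypothetical parameter values, chosen only for illustration.
n, m, s, xi, a = 0.02, 0.03, 15.0, 0.25, 0.4

ratio = skill_wage_ratio(n, m, s, xi)
approx_differential = (n + m) / xi

# Grid search over feasible skill levels 0 < alpha < 1 / (1 + (n + m)/xi).
B = 1 + (n + m) / xi
alpha_max = 1 / B
grid = [alpha_max * k / 10000 for k in range(1, 10000)]
best_alpha = max(grid, key=lambda al: per_capita_output(al, n, m, s, xi, a))

# For Cobb-Douglas, condition (4) reduces to alpha = a / B.
print(ratio - 1, approx_differential, best_alpha, a / B)
```

With these assumed values the exact differential is a little under 22 per cent against an approximation of 20 per cent, and the grid optimum agrees with the closed form obtained from (4).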


A General Model of Innate Differences 

The previous model is now generalized. 

Let the population be divided into r (r integer, r > 1) groups
characterized by innate differences from each other. These differences may be based
on sex, ability, or health. Each group $i = 1, \ldots, r$ is described by a mortality
rate $m_i$ of skilled workers belonging to that group, a graduation rate $\xi_i$,
and a student-staff ratio $s_i$ required to maintain that graduation rate.
Individuals within each group will differ also, but in an essentially random
fashion. Provided that the groups contain sufficient members, the aggregate
parameters for each group will be relatively constant. The proportion $p_i$
of the total workforce in group i is also relatively constant. Of course,
the natural rate of net population growth is the growth rate of the total
workforce in each group $i = 1, \ldots, r$.

Of the defining characteristics of groups, $m_i$ reflects both sex and
health characteristics, while $s_i$ and $\xi_i$ alike mirror differences in ability
and health. For example, healthy geniuses finish studying more quickly
than unhealthy geniuses, and also give longer service once trained.

Let $\alpha_i$ be that proportion of the $i$th group who are skilled. Balanced
growth requires the maintenance of $\alpha_1, \ldots, \alpha_r$.

The analysis of manpower requirements for each group is the same as
that for the total workforce in the basic model. Thus, a proportion
$E_i/S_i = (n + m_i)/(s_i \xi_i)$ of the skilled workforce in group $i$ must be engaged
in education in balanced growth. This supposition that teachers of group
$i$ belong to group $i$ is for convenience only; in general, teachers may
belong to any group. But teachers, and skilled workers, have the same
productivity whatever group they belong to, so it makes no difference to
output levels to assign teaching duties in this way.

Within group $i$ the number of skilled workers available for the
production of the consumption good is $(1 - (n + m_i)/(s_i \xi_i))\, S_i$, while the
number of unskilled workers so available is $L_i - (1 + (n + m_i)/\xi_i)\, S_i$,
where $L_i$ is the workforce in group $i$. It follows that the per capita output
of the consumption good for the economy is

\[
c = f\Bigl( \sum_{i=1}^{r} \bigl(1 - (n + m_i)/(s_i \xi_i)\bigr)\, \alpha_i p_i,\;\;
            1 - \sum_{i=1}^{r} \bigl(1 + (n + m_i)/\xi_i\bigr)\, \alpha_i p_i \Bigr) \tag{5}
\]

in balanced growth. This is to be maximized with respect to $\alpha_1, \ldots, \alpha_r$.

Clearly, $\alpha_i \ge 0$, $i = 1, \ldots, r$. From [4] it is known that
$\alpha \le 1/(1 + (n + m)/\xi)$ in the basic model. By analogy,

\[
\alpha_i \le \frac{1}{1 + (n + m_i)/\xi_i} \equiv \delta_i, \qquad i = 1, \ldots, r. \tag{6}
\]


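It is necessary and sufficient for a maximum of (5) subject to (6) and
nonnegativity constraints that

\[
\frac{\partial f(\cdot)/\partial X}{\partial f(\cdot)/\partial Y}
\;\begin{Bmatrix} < \\ = \\ > \end{Bmatrix}\; \gamma_i
\quad \text{as} \quad
\alpha_i \begin{cases} = 0, \\ \in [0, \delta_i], \\ = \delta_i, \end{cases}
\qquad i = 1, \ldots, r, \tag{7}
\]

where

\[
\gamma_i = \frac{1 + (n + m_i)/\xi_i}{1 - (n + m_i)/(s_i \xi_i)},
\qquad i = 1, \ldots, r. \tag{8}
\]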


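Without loss of generality, suppose groups are ordered from 1 to $r$ by
$\gamma_i < \gamma_{i+1}$, $i = 1, \ldots, r - 1$.³ Let $i^*$ be such that

\[
\gamma_{i^*} \le
\frac{\partial f(\cdot)/\partial X}{\partial f(\cdot)/\partial Y}
< \gamma_{i^*+1}. \tag{9}
\]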


³ Suppose there are no ties. Two groups $j$ and $k$ with $\gamma_j = \gamma_k$ could be combined for
policy purposes, despite differences in the basic parameters $n$, $m_i$, $s_i$, $\xi_i$. Of course,
if these parameters are altered, this combination may no longer be valid.





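Together (7) and (9) imply that

\[
\alpha_i^* \begin{cases} = \delta_i \\ \in [0, \delta_i] \\ = 0 \end{cases}
\quad \text{as} \quad
i \;\begin{Bmatrix} < \\ = \\ > \end{Bmatrix}\; i^*. \tag{10}
\]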

There is a distinct cut-off in the education of the groups. Nobody gets
trained from the high index groups (that is, $i > i^*$) until all are trained
in the earlier groups. For example, prevailing social patterns may imply
that women will give fewer years of skilled service than men, other things
being equal. Women will then be trained only if their relatively high $m_i$ is
offset by a relatively high graduation rate $\xi_i$. For this reason, the average
quality of educated women will exceed that of educated men. This fact
does not imply that society would gain from educating more women:
generally it will be better to train men rather than women.
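
The cut-off rule in (10) can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Cobb-Douglas form $f(X, Y) = X^{a} Y^{1-a}$ with $a = 0.6$, the two hypothetical groups "A" and "B", their parameter values, and the function names. The code orders the groups by $\gamma_i$ from (8), caps each group at $\delta_i$ from (6), and searches along the path on which each group is filled completely before the next is started.

```python
def gamma(n, m, xi, s):
    """Priority index gamma_i from equation (8)."""
    load = (n + m) / xi
    return (1.0 + load) / (1.0 - load / s)

def delta(n, m, xi):
    """Upper bound delta_i on the skilled proportion, equation (6)."""
    return 1.0 / (1.0 + (n + m) / xi)

def consumption(alphas, groups, n, a=0.6):
    """Per capita output (5) with an ASSUMED Cobb-Douglas f(X, Y) = X**a * Y**(1-a)."""
    X = sum((1 - (n + g["m"]) / (g["s"] * g["xi"])) * al * g["p"]
            for al, g in zip(alphas, groups))
    Y = 1 - sum((1 + (n + g["m"]) / g["xi"]) * al * g["p"]
                for al, g in zip(alphas, groups))
    return X ** a * Y ** (1 - a) if X > 0 and Y > 0 else 0.0

def best_cutoff_policy(groups, n, steps=200):
    """Grid search along the cut-off path: fill groups in order of increasing gamma."""
    order = sorted(groups, key=lambda g: gamma(n, g["m"], g["xi"], g["s"]))
    caps = [delta(n, g["m"], g["xi"]) for g in order]
    best_c, best_alphas = -1.0, None
    for k in range(len(order)):                 # k indexes the marginal group
        for j in range(steps + 1):
            alphas = caps[:k] + [caps[k] * j / steps] + [0.0] * (len(order) - k - 1)
            c = consumption(alphas, order, n)
            if c > best_c:
                best_c, best_alphas = c, alphas
    return order, caps, best_alphas, best_c

# Two hypothetical groups that differ only in the mortality (dropout) rate.
groups = [
    {"name": "B", "m": 0.08, "xi": 0.2, "s": 15.0, "p": 0.5},
    {"name": "A", "m": 0.04, "xi": 0.2, "s": 15.0, "p": 0.5},
]
order, caps, alphas, c = best_cutoff_policy(groups, n=0.02)
for g, cap, al in zip(order, caps, alphas):
    print(g["name"], "alpha =", round(al, 3), "of at most", round(cap, 3))
```

With these assumed values the low-$\gamma$ group is trained up to its bound while the other group is only partly trained, which is the pattern (10) describes. The search is restricted to the cut-off path by construction, so the sketch illustrates the rule rather than proving it.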

Many pressures might lead to a common student-staff ratio for all
groups. Replacing $s_i$ everywhere by $s$ yields the following ordering of
groups for educational purposes:

\[
(n + m_i)/\xi_i < (n + m_{i+1})/\xi_{i+1}, \qquad i = 1, \ldots, r - 1. \tag{11}
\]

When $n = 0$, the ranking in (11) depends only on $m_i/\xi_i$. $1/\xi_i$ is the
investment of unskilled labor in training, so that $m_i/\xi_i$ is the rate of loss
on this investment. The groups for which this loss is low are those which
then get priority in education. The trade-off between mortality and ability
is most striking in this case. A doubling in the probability of dropout
from the workforce must be matched by a halving of the expected length
of training if the ranking in educational priority is to be maintained.

In general, educational priorities depend on the rate of net population
growth. It is easy to find cases in which the ordering of groups is reversed
for different values of $n$, other things being equal.⁴ At low rates of
population growth the rate of loss on investment in skills is important, but
at high rates of population growth greater emphasis must be placed on the
groups with high graduation rates. When there is a common student-staff
ratio, the educational ranking of a pair of groups can be reversed as
population growth increases only if they have different graduation rates.
One consequence of this general result is that women will get higher
educational priority if the population growth rate is high rather than low.

⁴ For example, in the case in which $s_i = s_j$, (11) provides the basis of the ranking.
Let $m_i = 0.04$, $\xi_i = 0.1$ and $m_j = 0.05$, $\xi_j = 0.12$. Then if $n = 0$, group $i$ is ranked
before group $j$, since $m_i/\xi_i = 0.4$ while $m_j/\xi_j = 0.417$. But if $n = 0.02$, group $j$ is
ranked before group $i$, since $(n + m_i)/\xi_i = 0.6$ while $(n + m_j)/\xi_j = 0.583$.
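
The reversal in this footnote can be reproduced directly. The sketch below uses the footnote's own values for $m$ and $\xi$; the function names and the closed-form crossover rate are additions made here for illustration, obtained by equating the two sides of (11) for a pair of groups.

```python
def loss_index(n, m, xi):
    """Ranking index (n + m) / xi from (11), valid for a common student-staff ratio."""
    return (n + m) / xi

# (m, xi) for the two groups in the footnote.
groups = {"i": (0.04, 0.10), "j": (0.05, 0.12)}

def ranking(n):
    """Groups ordered from highest to lowest educational priority."""
    return sorted(groups, key=lambda g: loss_index(n, *groups[g]))

def crossover_rate(group_a, group_b):
    """Growth rate n at which the two indices are equal (requires xi_a != xi_b)."""
    m_a, xi_a = groups[group_a]
    m_b, xi_b = groups[group_b]
    return (m_b * xi_a - m_a * xi_b) / (xi_b - xi_a)

for n in (0.0, 0.02):
    indices = {g: round(loss_index(n, *groups[g]), 3) for g in groups}
    print(f"n = {n}: ranking {ranking(n)}, indices {indices}")

n_star = crossover_rate("i", "j")
print("ranking reverses at n =", round(n_star, 4))
```

The crossover formula makes visible why a reversal requires different graduation rates: its denominator vanishes when $\xi_i = \xi_j$.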





Care should be taken in interpreting the conclusion that educational 
priorities depend on population growth rates, as well as the other 
“comparative dynamic” propositions of this paper. Comparisons of two 
economies which differ in one respect or more are made. But the parameters
describing an economy are related through the age structures of the
population and the trainees. A change in n will have an impact on mortality 
rates and graduation rates: certainly, in the short run an increase in the 
birthrate will affect both graduation and mortality rates by increasing the 
proportion of young people in the population. These same effects, somewhat
diluted, will also be present in the long run.

The analysis in this section is quite general, but it does not apply to age 
based differences. It is essential for the model given here that the groups 
are well-defined, with no movement of individuals between them. This 
will not be so if age is a characteristic defining membership to a group, 
since the process of aging involves a shift from one set of characteristics 
to another. Therefore the conclusion that no unskilled elderly people 
should receive training may be justifiable, but not from this model. 

Males and Females 

Some remarks on the education of women have already been made. 

A more systematic discussion of some economic problems associated with 
educating women is now given; further comments will be made in later 
sections also. 

The model of innate differences is now specialized by supposing that
there are two groups, males ($M$) and females ($F$), for which $\xi_M = \xi_F = \xi$
and $s_M = s_F = s$, while $m_M \ne m_F$ in general. The only difference between
the groups is their mortality rates. Prevailing cultural patterns in most
societies imply that women spend less time in the workforce than do men.
Women have the same mortality rate as men, but social forces associated
with the requirements of child-bearing ensure that $m_F > m_M$ (it is
tempting to refer to this difference euphonically as the “morality factor”).

As a consequence of this loss due to social mores, the female part of the
workforce is less than one-half; that is, $p_F < \tfrac{1}{2} < p_M$. The proportion $p_F$
varies with $m_F$, being one half when $m_F = m_M$.

Optimality requires that no females are educated unless all males who
can be educated are so. The “morality factor” lowers females in the
priority ranking for education. The aggregative formulation makes no 
allowances for individual differences. A woman may plan not to marry, 
or have children; her private mortality rate is then the same as that of 
a man. But, in the absence of firm guarantees, this plan cannot be relied
on by others, for events may cause the plan to be modified. This element
of risk is accentuated if dishonesty is optimal; it may be in the interests of a 
woman to become skilled even if she intends to drop out of the workforce 
for a time, because skilled workers earn more than unskilled workers. 
With complete capital markets such deception would prove self-defeating. 
In the present aggregative model no personal responsibility is assumed by 
individuals, for education is allocated on the basis of average behavior. 
For this reason women with a genuine desire for education will find it more 
difficult in this planned economy to obtain an education than in a
competitive economy with complete capital markets.

The present analysis concerns centrally planned educational facilities. 
Clearly, similar conclusions emerge for private employers who provide job 
training. Whatever the aims of the employer, market forces will require a 
satisfactory return on his investments, so it is rational for males to be 
preferred to females in jobs requiring periods of subsidized training. 

Educational policies which lower the priority given to female education 
in turn lower the average income of females below that of males. Whether 
marriage partners are randomly selected, or there is a tendency for 
educated males to marry educated females, on the average there is less 
opportunity cost if the female stays at home than if the male does so. 
Rational family decision making to maximize income will then imply that
$m_F > m_M$, which is the basis of discrimination against women in educational
policy. Of course, a family unit need not contain only one adult
male and one adult female, but the conclusion remains valid if the unit
has at least one female.

Recognizing that $m_F$ may depend on educational policy does not affect
the general result. Training more women may perhaps lower $m_F$ (and raise
$m_M$), but as long as child-bearing and child-rearing cause the mortality
rate for females to exceed that for males it will remain optimal to train
all males before any females.

Female education can be advanced by a “no discrimination” rule. The 
basic model may be adapted to describe the second-best optimal policy 
which follows from this rule. 

In the context of balanced growth a suitable definition of “no
discrimination” is that $\alpha_F = \alpha_M$, although this will require the education
of more women than men. The average mortality rate of skilled workers
is then $\tfrac{1}{2}(m_F + m_M)$. It is clear that the cost of a rule against sexual
discrimination is smaller the closer together $m_F$ and $m_M$ are. The
second-best optimal wage differential for skill is approximately
$(2n + m_F + m_M)/2\xi$. Consequently, if not all males are educated in the
discriminatory, first-best, policy (so the differential for skill is approximately
$(n + m_M)/\xi$), the effect of the “no discrimination” rule is to raise
the differential; on the other hand, if some women are educated in the
first-best policy (so the differential for skill is approximately $(n + m_F)/\xi$),
the effect of the “no discrimination” rule is to lower the differential. When
males’ education is at a maximum, but no women are educated, the
differential lies between $(n + m_M)/\xi$ and $(n + m_F)/\xi$, so it is not clear
what the impact of a “no discrimination” rule will be.
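
A short numerical sketch may help fix the three approximate differentials compared above. The parameter values ($n = 0.02$, $\xi = 0.2$, $m_M = 0.04$, $m_F = 0.08$) and the function name are assumptions made only for illustration.

```python
def approx_differential(n, m, xi):
    """Large-s approximation (n + m) / xi to the wage differential for skill."""
    return (n + m) / xi

# Hypothetical parameter values, chosen only for illustration.
n, xi = 0.02, 0.2
m_M, m_F = 0.04, 0.08

males_marginal = approx_differential(n, m_M, xi)      # first best, not all males educated
females_marginal = approx_differential(n, m_F, xi)    # first best, some women educated
no_discrimination = (2 * n + m_F + m_M) / (2 * xi)    # second best under the rule

print("first best, males marginal:  ", round(males_marginal, 3))
print("no-discrimination rule:      ", round(no_discrimination, 3))
print("first best, females marginal:", round(females_marginal, 3))
```

Under the approximation the second-best differential is the simple average of the two first-best values, so it always lies between them; this is why the rule raises the differential in one case and lowers it in the other.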

Many reasons cause the technologies of economies to differ. Of two
economies with the same population, mortality, and educational
parameters, and the same isoquant map, the welfare loss from a policy
of “no discrimination” is smaller in the more productive, because of the
diminishing marginal utility of consumption. Pressure to allow women
equal educational opportunities will be more effective in wealthy than in
poor societies, because it is rather less costly to implement such policies
in the former.

Alternative ways of promoting female education can be adopted.
Provision of child-care facilities will reduce $m_F$, and increase the female
proportion of the workforce. But the “males first” rule is not suspended
unless $m_F$ becomes less than or equal to $m_M$.

The effects of reducing $m_F$ can be seen from Fig. 1. On the unit 45° line
the numbers of skilled and unskilled workers per capita can be represented.
For example, if a proportion $\alpha_M$ of males, and no females, are skilled,
then $p_M \alpha_M$ of the workforce is skilled: this identifies the point $E$ with
coordinates $(1 - p_M \alpha_M,\; p_M \alpha_M)$ on the 45° line. Maintenance of this
pattern of skills involves having $(1 - (n + m_M)/(s\xi))\, \alpha_M p_M$ skilled, and
$1 - (1 + (n + m_M)/\xi)\, \alpha_M p_M$ unskilled, workers per capita available for
the production of the consumption good: this identifies a point $F$ on the
line $CB$, which is mapped by varying $\alpha_M$ from 0 to $\delta_M$ (defined by (6)). It is
easily seen that the slope of $CB$ is $-(1 - (n + m_M)/(s\xi))/(1 + (n + m_M)/\xi)$.
The difference between $E$ and $F$ is the measure of the scale of the education
sector; the line $EF$ has slope $1/s$, which is the staff-student ratio. With
$\alpha_M = \delta_M$, the line $BA$ is mapped by allowing $\alpha_F$ to vary from 0 to $\delta_F$,
recalling that maintaining the skill pattern uses up resources of both skilled
and unskilled labour. The slope of $BA$ is
$-(1 - (n + m_F)/(s\xi))/(1 + (n + m_F)/\xi)$, which is greater than the slope of
$CB$ when $m_F > m_M$.
Sustainable inputs per capita to consumption goods production are shown
by $ABC$. The optimal pattern of skill is found where an isoquant touches
this frontier, as at $D$, in which case $\alpha_M^* < \delta_M$ and $\alpha_F^* = 0$.

Fig. 1. The effects of a reduction in the female rate of dropout from the workforce
are to raise the proportion of males who are skilled, and to lower the wage differential
for skill.

If $m_F$ decreases to $m_F'$, there are two effects on the diagram. First, the
female proportion of the workforce is increased from $p_F$ to $p_F'$; consequently
the break in the sustainable inputs frontier shifts to $B'$. Secondly,
the segment of the frontier which involves skilled women becomes more
negatively sloped. This gives the frontier $A'B'C$: the new optimal pattern
of skills is found where an isoquant touches this.

Two cases may be distinguished. First, $B'$ may be not to the right of $D$.
Then the reduction in female mortality will not result in any female
education, but will result in an increase in the proportion of males who
are skilled. Women will be used only in unskilled jobs, and will free men
for skilled tasks. Gross output will rise, but average output, and the wage
differential for skill, will be unchanged. Of course, this case only arises
if not all males are educated at the high value for female mortality, $m_F$.
Secondly, $B'$ may be to the right of $D$. The proportion of males who are
skilled will not decrease; indeed if not all males are skilled at $m_F$, in this
case all are skilled at $m_F'$. The reduction in female mortality does then
increase female education, but also ensures that all males are educated.
In addition, average output falls, and the differential for skill rises, when
not all males are skilled at the high female mortality rate. If some women
had been educated at the high mortality rate it is not clear that the
proportion of skilled females in the workforce is higher at the low rate:
the comparison depends on the relative skill intensities of education and
consumption goods production, and the elasticity of factor substitution in
consumption goods production. All that is clear is that the wage
differential for skill falls.

Evidently, the effects of reducing $m_F$ on female education are equivocal,
although it is certain that educational opportunities for males are not
harmed by such reductions. Costly ways of reducing female mortality, such
as child-care centers, may find justification in increases in total output,
but their impact on women’s education, and the distribution of income,
is not necessarily beneficial.


Variable Student-Staff Ratio 

The student-staff ratio is now regarded as a policy variable. A new 
feature which this introduces is the dependence of the graduation rate on 
the student-staff ratio. 

Let $\phi$ be the educational production function, $G = \phi(E, T)$, where $G$
is the production of graduates, $E$ is the number of educators, and $T$ is the
number of trainees. It is natural to assume that $\phi$ is homogeneous of degree
one, for the replication of independent, identical, training centers (schools)
will increase graduate production in proportion to the replication. $\phi$ is
assumed to be quasi-strictly-concave. Naturally, $G \le T$, which restricts
the class to which $\phi$ can belong: It cannot be C.E.S., for example.

Of course,

\[
\xi = \frac{G}{T} = \phi\Bigl(\frac{E}{T},\, 1\Bigr) \equiv \theta(s). \tag{12}
\]

The graduation rate $\xi$ is a function of the student-staff ratio $s$; $\theta$ is assumed
to be strictly convex, with $\theta'(s) < 0$ for all $s$, and with $\theta(0) \le 1$ and
$\theta(\infty) = 0$.

The form assumed for $\theta$ is reasonable. When a teacher has few students
the graduation rate will be high; as the number of students of the teacher
increases, the graduation rate will fall. This aggregative relationship is
unlikely to hold exactly for individual teachers, who will have different
class sizes which maximize their graduation rates: classes above or below
these sizes will result in less success, for valuable group interactions will be
lost if the classes are too small while large classes will be beyond the
organizational capacities of the teachers. These considerations suggest
that the aggregate relationship $\theta$ can be increasing only for small $s$, if at all.
Nothing is lost by requiring that $\theta'(s) < 0$ for all $s$; it can be shown that
for the optimal student-staff ratio $s^*$, $\theta'(s^*) < 0$. Even if there is an $s$
which maximizes $\xi$ (and this would be the optimal student-staff ratio to an
educationalist) this should not be implemented: The failure rate in
education should not be minimized.

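By replacing $\xi$ by $\theta(s)$ in (3), the following necessary and sufficient
conditions for an interior maximum can be derived:

\[
\frac{\partial f(\cdot)/\partial X}{\partial f(\cdot)/\partial Y}
  = \frac{1 + (n + m)/\theta(s^*)}{1 - (n + m)/(s^* \theta(s^*))} \tag{13}
\]

and

\[
\frac{\partial f(\cdot)/\partial X}{\partial f(\cdot)/\partial Y}
  = \frac{-\,s^{*2}\, \theta'(s^*)}{\theta(s^*) + s^* \theta'(s^*)}. \tag{14}
\]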

Equation (13) compares directly with (4), remembering that the optimal
student-staff ratio $s^*$ determines an optimal graduation rate $\xi^*$. The
right-hand side of (14) is the negative of the slope of an isoquant of the
educational production function $\phi$;⁵ therefore, (14) requires that the rate
of transformation between skilled and unskilled workers be the same in
the consumption and education sectors.

Figure 2 illustrates the simultaneous determination of $\alpha^*$ and $s^*$.



Fig. 2. The simultaneous determination of the student-staff ratio and the level 
of skill. 

The point $(1 - \alpha^*, \alpha^*)$ on the 45° line shows the optimal per capita
numbers of skilled and unskilled workers. This point is the origin $O_e$ of an
Edgeworth box showing how the workers should be divided between the
sectors. Balanced growth requires the per capita production of $(n + m)\alpha^*$
graduates. An isoquant of the education sector showing the possibilities
of doing this is shown. The optimal staff-student ratio $1/s^*$ is found where
a consumption good isoquant is tangential to the education isoquant. For
this value of the student-staff ratio a locus of balanced growth input
possibilities with slope $-(1 - (n + m)/(s^* \theta(s^*)))/(1 + (n + m)/\theta(s^*))$ can be
defined. $\alpha^*$ is optimal because the consumption good isoquant is tangential
to this locus.

⁵ Proof. Using Euler’s theorem on $\phi$, write $\xi = \phi_E/s + \phi_T$. Now $\xi = \theta(s) =
\phi(1/s, 1)$, so $\theta'(s) = -\phi_E/s^2$. Hence, $\phi_T/\phi_E = -[(\theta(s)/s^2\theta'(s)) + 1/s]$.
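
The identities in this proof can be checked by finite differences. The sketch below is illustrative only: the exponential form $\theta(s) = \theta_0 e^{-ks}$ with $\theta_0 = 0.9$ and $k = 0.05$ is an assumption chosen because it is decreasing, strictly convex, and satisfies $\theta(0) \le 1$ and $\theta(\infty) = 0$; it is not a form proposed in the paper, and the evaluation point $(E, T) = (2, 30)$ is likewise arbitrary.

```python
import math

THETA0, K = 0.9, 0.05   # assumed parameters of the illustrative theta

def theta(s):
    """Assumed graduation-rate function: decreasing and strictly convex in s."""
    return THETA0 * math.exp(-K * s)

def theta_prime(s):
    """Analytic derivative of the assumed theta."""
    return -K * THETA0 * math.exp(-K * s)

def phi(E, T):
    """Graduates from E educators and T trainees: G = T * theta(T / E)."""
    return T * theta(T / E)

def partials(E, T, h=1e-6):
    """Central finite-difference estimates of phi_E and phi_T."""
    phi_E = (phi(E + h, T) - phi(E - h, T)) / (2 * h)
    phi_T = (phi(E, T + h) - phi(E, T - h)) / (2 * h)
    return phi_E, phi_T

E, T = 2.0, 30.0
s = T / E
phi_E, phi_T = partials(E, T)

euler_gap = theta(s) - (phi_E / s + phi_T)              # xi = phi_E/s + phi_T
slope_gap = theta_prime(s) - (-phi_E / s ** 2)          # theta' = -phi_E/s^2
rhs_14 = -s ** 2 * theta_prime(s) / (theta(s) + s * theta_prime(s))
ratio_gap = phi_E / phi_T - rhs_14                      # right-hand side of (14)

print("Euler gap:", euler_gap)
print("slope gap:", slope_gap)
print("ratio gap:", ratio_gap)
```

All three gaps should vanish up to finite-difference error, confirming that the right-hand side of (14) equals $\phi_E/\phi_T$ for this assumed technology.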







It follows from (14) that $\theta'(s^*) < 0$. The graduation rate should not be
maximized in balanced growth. While a reduction in the student-staff 
ratio would raise the graduation rate, this policy would also reduce the 
average level of consumption. 

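With a variable student-staff ratio it may be that innate differences
between groups will imply the optimality, not only of attaching different
priorities to the education of each group, but also of varying the type of
training each group gets. This is investigated for males and females. The
previous specialization of the general model of innate differences to the
case of males and females is modified by replacing $\xi$ by $\theta(s_j)$, $j = M, F$,
to allow for the possibility of different student-staff ratios. Necessary
and sufficient conditions for maximum sustainable per capita consumption
are:

\[
\frac{\partial f(\cdot)/\partial X}{\partial f(\cdot)/\partial Y}
\;\begin{Bmatrix} < \\ = \\ > \end{Bmatrix}\;
\frac{1 + (n + m_j)/\theta(s_j^*)}{1 - (n + m_j)/(s_j^* \theta(s_j^*))}
\quad \text{as} \quad
\alpha_j^* \begin{cases} = 0, \\ \in [0, \delta_j(s_j^*)], \\ = \delta_j(s_j^*), \end{cases} \tag{15}
\]

and

\[
\frac{\partial f(\cdot)/\partial X}{\partial f(\cdot)/\partial Y}
\ge \frac{-\,s_j^{*2}\, \theta'(s_j^*)}{\theta(s_j^*) + s_j^* \theta'(s_j^*)},
\qquad > \text{ only if } \alpha_j^* = \delta_j(s_j^*), \tag{16}
\]

where $\delta_j(s_j) = 1/(1 + (n + m_j)/\theta(s_j))$, $j = M, F$. In deriving (16), use
was made of the fact that $\delta_j'(s_j) < 0$.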

Suppose all males who can be are educated, and that some females are
also educated. From (16), and the properties of the educational technology,
$s_F^* > s_M^*$, so that $\xi_F^* < \xi_M^*$. Females are educated under inferior
conditions to males. Of course, this difference reinforces the educational
priority ranking based on $m_F > m_M$, so (15) implies that $\alpha_F^* = 0$ unless
$\alpha_M^* = \delta_M(s_M^*)$. With a variable student-staff ratio, males get educational
priority in both the provision, and type, of education offered.

Evidently a similar conclusion will emerge in general: innate differences 
in groups will result in differences in educational priority and type. If the 
same educational technology is appropriate for all groups, which implies 
the only difference in groups is in the mortality parameters $m_j$, then the
argument for the males and females case can be simply adapted to show 
that the optimal student-staff ratio increases with the mortality rate. 
People thought likely to drop out of the workforce should be given low 
quality education. The pass rates implicit in the choice of student-staff 
ratios confirm the educational priority ranking. In general, different 
educational technologies will be appropriate for each group. Optimal skill 
levels and student-staff ratios must then be determined simultaneously, and 
the priority ranking can be performed only after the problem is solved. 





Education as Consumption 


Some implications of the view that education is in itself desirable are 
now derived. 

In the basic model, suppose that social welfare is $u(c, \alpha)$, where $u$ is a
quasi-strictly-concave utility function increasing in both arguments. It is
necessary and sufficient for maximum sustainable utility that

\[
\frac{\partial f(\cdot)}{\partial X} \Bigl(1 - \frac{n + m}{s\xi}\Bigr)
- \frac{\partial f(\cdot)}{\partial Y} \Bigl(1 + \frac{n + m}{\xi}\Bigr)
= -\,\frac{\partial u(\cdot)/\partial \alpha}{\partial u(\cdot)/\partial c}. \tag{17}
\]

This requires the equivalence of the rate of transformation with the rate
of substitution of education for consumption.

The wage differential implied by (17) is less than that defined by (4). 
The small order of the educational golden rule wage differential was 
previously emphasized; if education is desirable in itself, even smaller wage 
differentials for skill are in order. 

When groups are differentiated in treatment, the average educational
levels in each group may become important. In particular, both $\alpha_M$ and $\alpha_F$
may enter the social welfare function when males and females are educated
to different extents. The utility function is then $u(c, \alpha_M, \alpha_F)$. If $u$ is
quasi-strictly-concave, and education has a Leontief fixed-coefficients technology,
then it is necessary and sufficient for a maximum sustainable utility that

\[
\frac{\partial f(\cdot)}{\partial X} \Bigl(1 - \frac{n + m_j}{s\xi}\Bigr)
- \frac{\partial f(\cdot)}{\partial Y} \Bigl(1 + \frac{n + m_j}{\xi}\Bigr)
\;\begin{Bmatrix} < \\ = \\ > \end{Bmatrix}\;
-\,\frac{\partial u(\cdot)/\partial \alpha_j}{\partial u(\cdot)/\partial c}
\quad \text{as} \quad
\alpha_j \begin{cases} = 0, \\ \in [0, \delta_j], \\ = \delta_j, \end{cases}
\qquad j = M, F, \tag{18}
\]

where $\delta_j$ is defined by (6).

Suppose that $\alpha_j^* \in (0, \delta_j)$, $j = M, F$: Then (18) implies that
$\partial u(\cdot)/\partial \alpha_M < \partial u(\cdot)/\partial \alpha_F$. Females are trained before all males are trained only if the
marginal utility of female education exceeds the marginal utility of male
education.

In particular, if $\alpha_M^* = \alpha_F^*$ it must be true that
$\partial u(c, \alpha, \alpha)/\partial \alpha_F > \partial u(c, \alpha, \alpha)/\partial \alpha_M$. If equal educational opportunities are to be given, then
the social welfare function cannot be symmetric in $\alpha_M$ and $\alpha_F$: This offends
against some views of social justice.

With a social welfare function which is symmetric in male and female
education, the diminishing rate of substitution between these arguments
will ensure that $\alpha_F^* < \alpha_M^*$. Equal educational opportunities are not then
optimal.





So, in general, innate differences in ability will lead to differences in the 
opportunities for education offered to groups, even if it is optimal on 
welfare grounds to educate some members of all groups. 

Graduate “unemployment” is possible when education is regarded as a 
consumption good. In maximizing sustainable per capita consumption it 
happens that $\partial f(\cdot)/\partial X > \partial f(\cdot)/\partial Y$; see (4), for example. Graduates are
then best employed on skilled tasks. Increasing the level of skill as the 
result of concern over the educational attainments of society, or of groups 
in society, will reduce the marginal productivity of skilled workers and 
increase the marginal productivity of unskilled workers. If all graduates 
are employed on skilled jobs, as we have been assuming, it will happen that 
their marginal productivity will eventually fall below that of unskilled 
workers. An increase in output would follow from the allocation of 
trained workers to unskilled jobs. Such graduate “unemployment” is 
optimal, but creates a new range of social problems. 


Concluding Comments 

The results of this paper are unfashionable. But, if it is correct that 
educational policies can affect production possibilities, consideration 
needs to be given to the consumption, and implied utility, costs of these 
policies. Views of education which disregard the productive aspects of 
education themselves imply an attitude to the consumption of goods: 
This attitude ought to be made explicit, so it can be debated. 

Objections can be made to the fundamental premise that education 
increases the marginal productivity of workers. Arrow [1] has pointed out 
that some types of education only provide information to employers about 
individual workers’ capacities, and have a negligible effect on those 
capacities. Specialization in “filtering” is desirable, however; even if the 
most extreme view were correct it would not remove all need for educational
facilities. Anyway, it is clear that much education does raise worker 
productivity. 

Many topics have been examined here by developing the basic model: 
Many more can be examined by concentrating on other features. 

Brief consideration was given to an economy with one physical capital 
good, and many consumption goods, all produced using physical capital, 
skilled, and unskilled labor in [4]. No difference in the educational golden 
rule followed from this generalization of the basic model. This is because 
the rate of substitution between skilled and unskilled labor is everywhere 
equated for efficient factor allocation, while the technical possibilities of 
substitution of skilled for unskilled labor are determined by population
and educational parameters in balanced growth. However, by allowing
physical capital goods as an input into education, or by allowing physical 
capital goods in the situations of this paper, a new range of problems and 
policies emerges. 

A further, if minor, variation follows if the student-staff ratio is allowed 
for in the social welfare function: The motivation for this case is the 
common association of low student-staff ratios with high quality education. 
This raises a more important issue, which is now discussed. 

Differences in the student-staff ratio will usually result in differences 
in the quality of graduates, even if there is only one type of skill. More 
generally, there are a variety of skills, each with a particular student-staff 
ratio. If education for each skill involves using unskilled labor and skilled 
workers of the same type only, then the optimal wage differentials can be 
found from (4) using the parameters appropriate for the various occupations.
If the education of a skilled worker involves the input of many
kinds of skilled labor, then (4) does not apply. This is the case of Tu [12]. 
Examples of both types of skill can be found: Medicine, law, motor 
mechanics, and other specialized occupations, typically have training 
programs which involve little contact with other groups, while there are 
other jobs, like town planning, which involve many disciplines in the 
required course of study. 

Attention has recently been given to measurements of the impact of 
education on the distribution of income ([8, 9], for example). Concern 
over the distribution of income can be encompassed in the basic model by 
making social welfare a function of the consumption levels of skilled and 
unskilled workers. Educational policy will then optimally control the 
margin for skill. If the differential implied by (4) is too great, then policies 
can be adopted to reduce it. The cost of these is a loss of average consumption,
although unskilled workers gain both an absolute and a relative
income rise. 

Apparently the aggregative approach to educational policy has much 
to recommend it, but there are pitfalls to avoid. 

Individual aspirations are covered up by the aggregative formulation. 
Individuals who would benefit from education, but who are in low priority 
groups, as well as those who do not want education, but are in high 
priority groups, are alike forced into situations they do not want. Group 
differences have been accepted here; clearly, it is an important aspect of 
the problem to place people into the appropriate groups, and even more 
basically, to select the groups themselves. 

To get away from the aggregative approach requires fine classifications 
of people. The education of some individuals will depend on the classification
scheme adopted, so it is important that this is based on precise
data about relevant parameters: It should not be done on unsupported
belief, and care must be exercised if the defining parameters are themselves 
dependent on educational policy. 

The finer the classification scheme, the higher will be average utility. 
With a very fine classification it would be possible to recognize individual 
desires with respect to education. But it is not worthwhile to move 
completely from the aggregative view of the problem, for both the 
development of the classification scheme and the placement of people into 
slots in the scheme are costly. The problems of determining the best 
classification scheme, and the manner of its implementation, have not been 
considered here. 


References 

1. K. J. Arrow, Higher education as a filter, J. Public Econ. 2 (1973), 193-216. 

2. G. Becker, Investment in human capital: A theoretical analysis, J. Polit. Econ. 70 
(1962), 9-49. 

3. A. Bottomley, Optimum levels of investment in education and economic development,
Z. Gesamte Staatswiss. 122 (1966), 237-246.

4. R. Manning, Optimal aggregative development of a skilled workforce, Quart. J.
Econ. 89 (1975), 504-511.

5. E. S. Phelps, “Golden Rules of Economic Growth: Studies of Efficient and Optimal 
Investment,” Norton, New York, 1967. 

6. A. Razin, Optimum investment in human capital, Rev. Econ. Studies 34 (1972),
455-460. 

7. T. W. Schultz, Capital formation by education, J. Polit. Econ. 68 (1960), 671-683. 

8. J. Tinbergen, Substitution of graduate by other labour, Kyklos 27 (1974), 217-226. 

9. J. Tinbergen, Actual vs. optimal income distribution in a three-level education 
model, to appear. 

10. P. N. V. Tu, Optimal educational investment in an economic planning model, 
Canad. J. Econ. 4 (1969), 52-64.

11. P. N. V. Tu, The classical economists and education, Kyklos 22 (1969), 691-718. 

12. P. N. V. Tu, A multisectoral model of educational and economic planning,
Metroecon. 22 (1970), 207-226.

13. H. Uzawa, Optimum technical change in an aggregative model of economic growth,
Internat. Econ. Rev. 6 (1965), 18-31.



JOURNAL OF ECONOMIC THEORY 13, 396-413 (1976) 


Comparative Statics of a Residential Economy with
Several Classes*,†

John Hartwick 

Department of Economics, Queen's University, Ontario, Canada 

Urs Schweizer 

Department of Economics, M.I.T., Cambridge, Massachusetts 02139 

AND 

Pravin Varaiya 

Decision and Control Sciences Group; Electronic Systems Laboratory, Department of 
Electrical Engineering and Computer Science, Cambridge, Massachusetts 02139; 

Electronics Research Laboratory, University of California, Berkeley, California 94720 

Received June 18, 1975; revised May 3, 1976 


1. Introduction and Summary

The geography and economic structure of the city model under
consideration are orthodox. The city possesses circular symmetry and at its
center is the CBD (central business district) of radius p. People live at
distances x > p and they commute to work in the CBD. Commuting
cost is solely a money cost and depends only upon the distance of the
residence from the CBD. Individuals have fixed money wages, which they
use to occupy land (housing) and to consume other commodities.
These latter are available at fixed prices which are constant throughout
the city. The land market is competitive and so land rents, in equilibrium,
coincide with the maximum of individual bid rents and an exogenously
specified agricultural rent $r_A$. Land is owned by absentee landlords.

* A version of this paper was presented at a conference on Mathematical Models 
of Land Use at McMaster University, April 1975. 

† Research partially supported by National Science Foundation Grants GK-41647
and ENG 74-01551-A01; by the Swiss National Science Foundation, and by the Canada
Council. We are grateful for helpful comments from members of the Urban Workshop,
Department of Economics, M.I.T., Cambridge, Massachusetts 02139.

Copyright © 1976 by Academic Press, Inc.
All rights of reproduction in any form reserved.


There are n classes of people; class i is characterized by the number of
people in the class, $N_i$, and, for each individual in the class, his wage $w_i$
and utility function $U^i$. $U^i$ is assumed to be smooth, strictly
quasi-concave and such that housing is a normal good. Furthermore, it is
assumed that preferences and incomes between classes are related in such
a way that an individual in class i will occupy more land than one in class
i + 1. In particular, this holds if all individuals have the same utility
function and $w_i > w_{i+1}$.

Under these assumptions, in equilibrium, people in different classes
reside in characteristic concentric rings around the CBD with individuals
in class i + 1 living closer to the CBD than those in class i. Suppose class i
occupies the ring $J_i = [x_{i+1}, x_i]$ where $x_1 > \cdots > x_{n+1} = p$, the CBD
radius. Let $u_1, \ldots, u_n$ denote the equilibrium utility levels. The dependence
of these equilibrium values on the exogenous parameters can be expressed
in a functional form,

$$x_i = x_i(N_1, \ldots, N_n, w_1, \ldots, w_n), \qquad i = 1, \ldots, n,$$

$$u_i = u_i(N_1, \ldots, N_n, w_1, \ldots, w_n), \qquad i = 1, \ldots, n.$$

Our first set of results shows that the signs of the various partial
derivatives are unambiguous. The specific statements are these.

$$\partial x_j/\partial N_i > 0 \text{ if } j \le i, \quad \partial x_j/\partial N_i < 0 \text{ if } j > i, \quad \text{and } \partial u_j/\partial N_i < 0 \text{ all } j, \tag{i}$$

i.e., if the ith class increases in size, then the outer classes are pushed
away from the CBD whereas the inner ones are squeezed towards it, and
everyone's real income is reduced. This is not at all surprising.

$$\partial x_j/\partial w_i > 0 \text{ all } j, \quad \text{and } \partial u_j/\partial w_i > 0 \text{ if } j \ge i, \quad \partial u_j/\partial w_i < 0 \text{ if } j < i, \tag{ii}$$

i.e., if the ith class's income rises, then all classes are more suburbanized,
and the outer classes suffer a reduction in their real income whereas the
inner classes enjoy an increase. The total asymmetry of (i) and (ii) is of a
striking simplicity. Many people may find (ii) counter-intuitive since it
would appear at first sight that, since increases in $N_i$ or $w_i$ both create
increased demand pressures in the land market, therefore they would
have the same impact on the real income of the other classes.

These qualitative results, while telling us that increases in income or size
of an outer class have opposing impacts on an inner class, do not provide
any clue as to the relative magnitudes of these two effects. Our second set
of results is the outcome of some numerical exercises designed to reveal
what magnitudes of these effects might plausibly be expected. Choosing
n = 2, and the same Cobb-Douglas utility function for both classes,
we show that the negative effect on the poor due to an increase in the size
of the richer class is far greater than the positive effect due to an increase
in the latter’s income.

When this paper was first written, the only work related to ours was the
comparative analysis carried out by Wheaton [2] for the case of a single
class, n = 1. Since then two papers have appeared. Miyao [4] proves the
simple special case of (i) and (ii), $\partial x_j/\partial N_n > 0$, $\partial x_j/\partial w_n > 0$ for
$j = 1, \ldots, n$, and Wheaton [5] proves the result (ii) above for the case n = 2
when the utility function is the same for all. We are indebted to the referee for
pointing out these references and for several suggestions.


2. The Model and its Equilibrium 


Aside from the notation introduced already we need the following. The
money cost of commuting from x is t(x) and is such that


Assumption 1.

$$t_x = dt/dx > 0.$$

Rent of land is denoted by r, and the amount of land available for housing
at x is exogenously given to be $\theta(x)x$ (for a circular city we have
$\theta(x)x = 2\pi x$). The amount of land occupied by an individual is denoted h and
the bundle of other goods he consumes is the (column) vector c. The
exogenously fixed prices of these other goods are given by the (row) vector p.

The expenditure function for an individual of class i can then be defined
in the usual way as

$$E^i(r, u) = \min\{pc + rh \mid U^i(c, h) = u\}$$

and this is related to his compensated demand functions $C^i$, $H^i$ by

$$E^i(r, u) = pC^i(r, u) + rH^i(r, u); \qquad U^i(C^i(r, u), H^i(r, u)) = u. \tag{1}$$

$E^i$ has the following well-known properties:

$$E^i_u = \partial E^i/\partial u > 0, \qquad E^i_r = \partial E^i/\partial r = H^i > 0, \qquad E^i_{rr} = \partial^2 E^i/\partial r^2 = H^i_r < 0. \tag{2}$$

Let $r^i(u, I)$ be the function which gives the maximum rent he can offer
and still achieve a utility level u when his disposable income (net of
commuting costs) is I,

$$r^i(u, I) = \max\{h^{-1}(I - pc) \mid U^i(c, h) = u\}.$$

From this definition it follows that

$$E^i(r^i(u, I), u) = I. \tag{3}$$

Define the demand for land when $r^i(u, I)$ is the rent by

$$h^i(u, I) = H^i(r^i(u, I), u). \tag{4}$$

We assume that housing is a normal good, and further, if individuals
from class i - 1 and i face the same rent then the former demands more
housing, i.e.,

Assumption 2.

(a) $$H^i_u = \partial H^i/\partial u > 0, \tag{5}$$

(b) if for any r, x, $E^k(r, u_k) = w_k - t(x)$, k = i - 1, i, then
$H^{i-1}(r, u_{i-1}) > H^i(r, u_i)$.

Note that (5) together with (2) yields

$$E^i_{ru} = H^i_u = \partial^2 E^i/\partial u\,\partial r > 0. \tag{6}$$

Next, from (2) and (3),

$$r^i_I = (E^i_r)^{-1} > 0, \qquad r^i_u = -E^i_u (E^i_r)^{-1} < 0, \tag{7}$$

while, from (2), (4), (5), and (7),

$$h^i_I = H^i_r r^i_I < 0, \qquad h^i_u = H^i_r r^i_u + H^i_u > 0. \tag{8}$$

Finally, since the wage of an individual of class i is $w_i$ and since t(x) is
his commuting cost, his bid-rent function is $r^i(u, w_i - t(x))$, considered
as a function of w and x. From Assumption 1 and (7),

$$r^i_x = \partial r^i/\partial x = -t_x r^i_I < 0. \tag{9}$$
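The identities (3), (4) and the sign conditions (7), (8) can be checked on a concrete case. The sketch below assumes a Cobb-Douglas utility $U(c, h) = c^{a} h^{1-a}$ with a single other good priced at 1; the functional form, the exponent a = 0.75, the helper names and the sample values of u and I are illustrative assumptions, not part of the general argument. Under that assumption the expenditure function and the bid rent have closed forms, and the derivatives are estimated by central differences.

```python
# Sketch: bid rent under an ASSUMED Cobb-Douglas utility U(c, h) = c**A * h**B
# with one other good whose price is 1. A, B and the sample numbers below are
# illustrative assumptions.
A = 0.75
B = 1.0 - A


def expenditure(r, u):
    """E(r, u): minimum outlay c + r*h needed to reach utility u at rent r."""
    return u * (r / B) ** B / A ** A


def bid_rent(u, income):
    """r(u, I): the rent at which E(r, u) equals disposable income I, cf. (3)."""
    return B * (A ** A * income / u) ** (1.0 / B)


def land_demand(u, income):
    """h(u, I) = H(r(u, I), u), cf. (4); for Cobb-Douglas H = B * E / r."""
    return B * income / bid_rent(u, income)


def slope(f, u, income, wrt, step=1e-4):
    """Central finite-difference derivative of f with respect to 'u' or 'I'."""
    if wrt == "u":
        return (f(u + step, income) - f(u - step, income)) / (2 * step)
    return (f(u, income + step) - f(u, income - step)) / (2 * step)


u0, income0 = 50.0, 4000.0  # hypothetical utility level and net income
signs = {
    "r_I": slope(bid_rent, u0, income0, "I"),     # expected positive, cf. (7)
    "r_u": slope(bid_rent, u0, income0, "u"),     # expected negative, cf. (7)
    "h_I": slope(land_demand, u0, income0, "I"),  # expected negative, cf. (8)
    "h_u": slope(land_demand, u0, income0, "u"),  # expected positive, cf. (8)
}
```

For this particular utility the bid rent depends on u and I only through the ratio I/u, which is what makes the closed form so short; nothing in the general analysis relies on that property.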

The next result, which is used repeatedly, shows that the difference in
two bid-rent functions varies strictly monotonically in x.

Lemma 1. Suppose $\bar r = r^i(u_i, w_i - t(\bar x)) = r^j(u_j, w_j - t(\bar x))$ and i < j.
Then

$$r^j_x(u_j, w_j - t(x)) < r^i_x(u_i, w_i - t(x)) < 0 \quad \text{for } x \le \bar x. \tag{10}$$

Proof. By (1), $E^k(\bar r, u_k) = w_k - t(\bar x)$, k = i, j, and since i < j, by
A2(b),

$$H^i(\bar r, u_i) > H^j(\bar r, u_j). \tag{11}$$

From (3), $E^i(r^i(u_i, w_i - t(x)), u_i) = w_i - t(x)$, and if we differentiate this
with respect to x and then use (2), we obtain

$$H^i(r^i(u_i, w_i - t(x)), u_i)\, r^i_x(u_i, w_i - t(x)) = -t_x. \tag{12}$$

Similarly

$$H^j(r^j(u_j, w_j - t(x)), u_j)\, r^j_x(u_j, w_j - t(x)) = -t_x. \tag{13}$$

From these relations and (11) we conclude that (10) holds for $x = \bar x$.

Next, as long as both $x < \bar x$ and $r^i(u_i, w_i - t(x)) < r^j(u_j, w_j - t(x))$,
we can show using (2) and Assumption 2 that

$$H^i(r^i(u_i, w_i - t(x)), u_i) > H^i(r^j(u_j, w_j - t(x)), u_i) > H^j(r^j(u_j, w_j - t(x)), u_j),$$

and so again using (12), (13) we can conclude that (10) holds. Using this
fact and knowing that (10) holds at least for $x = \bar x$ the assertion follows
quite readily.

As a corollary to the lemma we have


Corollary 1. Under the assumptions of Lemma 1,

$$r^i(u_i, w_i - t(x)) \lesseqgtr r^j(u_j, w_j - t(x)) \quad \text{according as } x \lesseqgtr \bar x. \tag{14}$$
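The single-crossing property behind Lemma 1 and Corollary 1 can be illustrated numerically. The sketch below again assumes a common Cobb-Douglas utility with exponent 0.75 on the other good (an assumption made for illustration); the wages and the linear commuting cost are borrowed from the numerical example of Section 4, while the crossing point and the utility level of the inner class are hypothetical. The utility of the outer class is chosen so that the two bid-rent curves meet at the crossing point, and the gap between them is then evaluated on either side.

```python
# Sketch: single crossing of two bid-rent curves under an ASSUMED common
# Cobb-Douglas utility U(c, h) = c**A * h**B.
A = 0.75            # assumed exponent on the other good
B = 1.0 - A         # assumed exponent on land
T = 1200.0          # linear commuting cost t(x) = T * x
W_RICH, W_POOR = 12000.0, 5000.0  # class i (outer) and class j (inner), i < j


def bid_rent(u, income):
    """Closed-form bid rent for the assumed Cobb-Douglas utility."""
    return B * (A ** A * income / u) ** (1.0 / B)


def rent_gap(x, u_rich, u_poor):
    """r^i - r^j at distance x: outer-class bid minus inner-class bid."""
    return (bid_rent(u_rich, W_RICH - T * x)
            - bid_rent(u_poor, W_POOR - T * x))


X_BAR = 1.6     # hypothetical boundary at which the two bids coincide
U_POOR = 50.0   # hypothetical utility level of the inner class
# With this utility the bid rent depends only on income / u, so the curves
# cross at X_BAR exactly when the two income-to-utility ratios agree there.
U_RICH = U_POOR * (W_RICH - T * X_BAR) / (W_POOR - T * X_BAR)

# Negative inside X_BAR (inner class outbids), positive outside it.
gaps = {x: rent_gap(x, U_RICH, U_POOR) for x in (1.0, 1.3, 1.6, 1.9, 2.2)}
```

The gap changes sign only once, at the assumed crossing point, which is the ordering that places the lower-wage class nearer the CBD in this example.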

Next we define the envelope of the individual bid-rent functions,

$$R(u_1, \ldots, u_n, w_1 - t(x), \ldots, w_n - t(x)) = \max\{r^1(u_1, w_1 - t(x)), \ldots, r^n(u_n, w_n - t(x)), r_A\}.$$

Recall that $\theta(x)x\,dx$ is the amount of land available for residences in the
ring [x, x + dx]. We can now define the notion of an equilibrium.


Definition. An equilibrium consists of a set of utility levels $u_1, \ldots, u_n$
and a set of residence rings $J_1, \ldots, J_n$ contained in $[p, \infty)$ such that

$$x \in J_i \text{ if and only if } r^i(u_i, w_i - t(x)) = R(u_1, \ldots, u_n, w_1 - t(x), \ldots, w_n - t(x)), \quad i = 1, \ldots, n, \tag{15}$$

$$R(u_1, \ldots, u_n, w_1 - t(x), \ldots, w_n - t(x)) \text{ is continuous in } x, \tag{16}$$

$$N_i = \int_{J_i} \frac{\theta(x)x}{h^i(u_i, w_i - t(x))}\,dx, \quad i = 1, \ldots, n. \tag{17}$$

Relation (15) ensures that land is occupied by the highest bidders.
Equation (16) says that the equilibrium rent must be continuous across the
boundary between land occupied by adjacent classes, that is, no individual
pays more than he has to. Equation (17) is simply the land market clearing
condition. We derive some properties of an equilibrium.

First of all from (14) and (15) we see that individuals in class i + 1
live closer to the CBD than those in class i. Hence the rings must have the
form

$$J_i = [x_{i+1}, x_i] \quad \text{where } x_1 > x_2 > \cdots > x_{n+1} = p, \tag{18}$$

so that (16) can be rewritten more directly as

$$r^1(u_1, w_1 - t(x_1)) = r_A,$$
$$r^i(u_i, w_i - t(x_i)) = r^{i-1}(u_{i-1}, w_{i-1} - t(x_i)) \quad \text{for } i = 2, \ldots, n. \tag{19}$$

From (19) we can solve for $u_1$ as a function of $x_1$ and $w_1$. Then, by
induction, we can solve for $u_i$ as a function of $x_1, \ldots, x_i$ and $w_1, \ldots, w_i$.
The signs of the partial derivatives of these functions can be determined
as follows.

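Lemma 2.

$$\partial u_i/\partial x_j < 0 \quad \text{and} \quad \partial u_i/\partial w_j > 0 \quad \text{for } j \le i. \tag{20}$$

Proof. We proceed by induction. Differentiating the first equation in
(19) gives

$$r^1_u (\partial u_1/\partial w_1) + r^1_I = 0, \qquad r^1_u (\partial u_1/\partial x_1) - t_x r^1_I = 0,$$

and so, together with (7) and (9) this yields $\partial u_1/\partial w_1 > 0$, $\partial u_1/\partial x_1 < 0$.
Hence (20) is true for i = 1.

Now suppose (20) holds for i - 1. Just as before we still have, after
differentiating the second equation in (19) with respect to $w_i$, that

$$r^i_u(u_i, w_i - t(x_i))(\partial u_i/\partial w_i) + r^i_I(u_i, w_i - t(x_i)) = 0 \tag{21}$$

so that again $\partial u_i/\partial w_i > 0$; whereas differentiation with respect to $x_i$ gives
us

$$r^i_u(u_i, w_i - t(x_i))(\partial u_i/\partial x_i) + r^i_x(u_i, w_i - t(x_i)) = r^{i-1}_x(u_{i-1}, w_{i-1} - t(x_i)). \tag{22}$$

From (7), (10), (22), we conclude that $\partial u_i/\partial x_i < 0$. Let us also note from
(7) and (21) that

$$\frac{\partial u_i}{\partial w_i} = -\frac{r^i_I(u_i, w_i - t(x_i))}{r^i_u(u_i, w_i - t(x_i))} = \frac{1}{E^i_u(r^i(u_i, w_i - t(x_i)), u_i)} > 0. \tag{23}$$

To evaluate the derivative for j < i, we again differentiate the second
equation in (19),

$$r^i_u(u_i, w_i - t(x_i))\,\frac{\partial u_i}{\partial w_{i-1}} = r^{i-1}_u(u_{i-1}, w_{i-1} - t(x_i))\,\frac{\partial u_{i-1}}{\partial w_{i-1}} + r^{i-1}_I(u_{i-1}, w_{i-1} - t(x_i))$$

$$= r^{i-1}_I(u_{i-1}, w_{i-1} - t(x_i)) \left[1 - E^{i-1}_u(r^{i-1}(u_{i-1}, w_{i-1} - t(x_i)), u_{i-1})\,\frac{\partial u_{i-1}}{\partial w_{i-1}}\right]$$

$$= r^{i-1}_I(u_{i-1}, w_{i-1} - t(x_i)) \left[1 - \frac{E^{i-1}_u(r^{i-1}(u_{i-1}, w_{i-1} - t(x_i)), u_{i-1})}{E^{i-1}_u(r^{i-1}(u_{i-1}, w_{i-1} - t(x_{i-1})), u_{i-1})}\right]$$

(using (7), (21), (23)).

The expression in brackets is negative because of (6), and so using (7)
we get $\partial u_i/\partial w_{i-1} > 0$. Now, from (19) again,

$$r^i_u (\partial u_i/\partial u_{i-1}) = r^{i-1}_u$$

so that $\partial u_i/\partial u_{i-1} > 0$. Hence, using the induction hypothesis, we get

$$\frac{\partial u_i}{\partial x_j} = \frac{\partial u_i}{\partial u_{i-1}}\,\frac{\partial u_{i-1}}{\partial x_j} < 0, \quad j < i,$$

$$\frac{\partial u_i}{\partial w_j} = \frac{\partial u_i}{\partial u_{i-1}}\,\frac{\partial u_{i-1}}{\partial w_j} > 0, \quad j < i - 1.$$

Hence (20) holds for i, and the lemma is proved.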

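From (17) and the remark preceding Lemma 2 we note that $N_i$ is
(mathematically) determined by $x_{i+1}, \ldots, x_1$ and $w_1, \ldots, w_i$. The derivatives
in the next lemma are with respect to this functional dependence.


Lemma 3.

$$\frac{\partial N_i}{\partial x_j} > 0, \quad j \le i; \qquad \frac{\partial N_i}{\partial x_{i+1}} < 0; \qquad \frac{\partial N_i}{\partial w_j} < 0, \quad j \le i.$$

Proof. From (17) and (18) we obtain

$$\frac{\partial N_i}{\partial x_{i+1}} = -\frac{\theta(x_{i+1})\,x_{i+1}}{h^i(u_i, w_i - t(x_{i+1}))} < 0,$$

$$\frac{\partial N_i}{\partial x_i} = -\int_{x_{i+1}}^{x_i} \frac{\theta(x)x}{[h^i(u_i, w_i - t(x))]^2}\,h^i_u(u_i, w_i - t(x))\,\frac{\partial u_i}{\partial x_i}\,dx + \frac{\theta(x_i)\,x_i}{h^i(u_i, w_i - t(x_i))} > 0,$$

by (8) and (20). Furthermore, again using (17) and (20), we get for j < i,

$$\frac{\partial N_i}{\partial x_j} = -\int_{x_{i+1}}^{x_i} \frac{\theta(x)x}{[h^i(u_i, w_i - t(x))]^2}\,h^i_u(u_i, w_i - t(x))\,\frac{\partial u_i}{\partial x_j}\,dx > 0.$$

It only remains to prove the last assertion, which for j < i follows in the
same way as above from (17) and (20). Finally, for j = i, we get

$$\frac{\partial N_i}{\partial w_i} = -\int_{x_{i+1}}^{x_i} \frac{\theta(x)x}{[h^i(u_i, w_i - t(x))]^2} \left[h^i_u\,\frac{\partial u_i}{\partial w_i} + h^i_I\right] dx. \tag{24}$$

Now, from (2), (4), and (8) we have

$$h^i_u\,\frac{\partial u_i}{\partial w_i} + h^i_I = \frac{E^i_{ru}(r^i(u_i, w_i - t(x)), u_i)}{E^i_u(r^i(u_i, w_i - t(x_i)), u_i)} + \frac{E^i_{rr}(\cdots x \cdots)}{E^i_r(\cdots x \cdots)} \left[1 - \frac{E^i_u(\cdots x \cdots)}{E^i_u(\cdots x_i \cdots)}\right],$$

where $(\cdots x \cdots)$ abbreviates the argument $(r^i(u_i, w_i - t(x)), u_i)$. This
expression is positive since each of the terms on the right is. Hence, from (24),
$\partial N_i/\partial w_i < 0$ also, the integrand in (24) being positive and the integral
entering with a minus sign.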

Lemmas 2 and 3 are crucial for the remainder of the analysis. Both
follow from Assumption 2. However, if one is willing to assume (5), (10),
and (18) instead of Assumption 2 then Lemmas 2 and 3 continue to hold.
This is Miyao’s approach. However, if Assumption 2 does not hold,
then the only way to check whether (18) holds is to explicitly compute the
equilibrium. In the case of identical utility functions our analysis shows
that Assumption 2(a), together with $w_1 > w_2 > \cdots > w_n$, implies Assumption
2(b), an important special case which seems to have been overlooked by
Miyao.


3. Comparative Statics 

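If we express the functional dependence of $N_i$ on $x_{i+1}, \ldots, x_1$ and
$w_1, \ldots, w_i$ in differential form we get

$$dN_i = \sum_{j=1}^{i+1} (\partial N_i/\partial x_j)\,dx_j + \sum_{j=1}^{i} (\partial N_i/\partial w_j)\,dw_j, \qquad i = 1, \ldots, n,$$

or in matrix notation

$$dN = A\,dx + a\,dx_{n+1} - B\,dw. \tag{25}$$

By Lemma 3 the vector a and the n x n matrices A, B have the sign
pattern shown below

$$a = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ - \end{bmatrix}, \qquad A = \begin{bmatrix} + & - & 0 & \cdots & 0 \\ + & + & - & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ + & \cdots & \cdots & + & - \\ + & \cdots & \cdots & \cdots & + \end{bmatrix}, \qquad B = \begin{bmatrix} + & 0 & \cdots & 0 \\ + & + & \ddots & \vdots \\ \vdots & & \ddots & 0 \\ + & \cdots & \cdots & + \end{bmatrix}. \tag{26}$$

Here $dx = (dx_1, \ldots, dx_n)'$, $dw = (dw_1, \ldots, dw_n)'$.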

Now to obtain the results announced in the introduction we will first
obtain the sign pattern of $A^{-1}$, as well as some information about the
relative magnitudes of some of its coefficients. With this knowledge and
the results obtained so far we will be able to reach the desired conclusion.

A portion of the sign pattern of $A^{-1}$ can already be determined from (26).

Lemma 4. If A has the pattern of (26), then det A > 0 and $A^{-1}$ has the
pattern shown below

$$A^{-1} = \begin{bmatrix} + & + & \cdots & \cdots & + \\ - & + & \cdots & \cdots & + \\ ? & - & \ddots & & \vdots \\ \vdots & \ddots & \ddots & + & + \\ ? & \cdots & ? & - & + \end{bmatrix}, \tag{27}$$

where ? means the sign is unknown.

Proof. See Lancaster [3, p. 119].

As a corollary we can see the effect of a change in the size of the innermost
class, $N_n$.

Corollary 2.

$$\partial x_j/\partial N_n > 0, \text{ all } j; \qquad \partial u_j/\partial N_n < 0, \text{ all } j.$$

Proof. From (25) and (27) we can see that the vector
$dx = A^{-1}(dN_n)e_n > 0$ if $dN_n > 0$, where $e_n = (0, \ldots, 0, 1)'$, thus giving the
first result. The second assertion follows from the first one and Lemma 2.
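The sign pattern asserted in Lemma 4 can be spot-checked numerically. The sketch below draws random matrices with the pattern of A in (26) (positive entries on and below the diagonal, negative entries on the first superdiagonal, zeros above it), inverts them exactly with rational arithmetic, and checks that the determinant is positive, that every entry of the inverse on or above the diagonal is positive, and that every entry on the first subdiagonal is negative. The function names, the range of the random entries and the number of trials are arbitrary choices for illustration; a random check of this kind is evidence, not a proof.

```python
import random
from fractions import Fraction


def random_pattern_matrix(n, rng):
    """Random matrix with the sign pattern of A in (26)."""
    matrix = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j <= i:
                matrix[i][j] = Fraction(rng.randint(1, 9))    # '+' entries
            elif j == i + 1:
                matrix[i][j] = -Fraction(rng.randint(1, 9))   # '-' entries
    return matrix


def inverse_and_det(matrix):
    """Exact Gauss-Jordan inversion; returns (inverse, determinant)."""
    n = len(matrix)
    work = [row[:] + [Fraction(int(i == j)) for j in range(n)]
            for i, row in enumerate(matrix)]
    det = Fraction(1)
    for col in range(n):
        pivot = next(r for r in range(col, n) if work[r][col] != 0)
        if pivot != col:
            work[col], work[pivot] = work[pivot], work[col]
            det = -det
        pivot_value = work[col][col]
        det *= pivot_value
        work[col] = [v / pivot_value for v in work[col]]
        for r in range(n):
            if r != col and work[r][col] != 0:
                factor = work[r][col]
                work[r] = [a - factor * b for a, b in zip(work[r], work[col])]
    return [row[n:] for row in work], det


def check_pattern(n, trials=200, seed=0):
    """True if every sampled matrix has det > 0 and the Lemma 4 signs."""
    rng = random.Random(seed)
    for _ in range(trials):
        inv, det = inverse_and_det(random_pattern_matrix(n, rng))
        if det <= 0:
            return False
        for j in range(n):
            for k in range(n):
                if k >= j and inv[j][k] <= 0:       # on/above diagonal: '+'
                    return False
                if k == j - 1 and inv[j][k] >= 0:   # first subdiagonal: '-'
                    return False
    return True
```

Entries further below the diagonal are deliberately left unchecked, since their signs are the ones marked as unknown in the lemma.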











To determine the effects of a change in $N_i$, i < n, it is necessary to
proceed somewhat circuitously. Instead of considering an exogenously
fixed agricultural rent $r_A$ as in (19), we consider as exogenously given the
outer radius of the city, $x_1$. This will cause some simple changes in the
various functional dependencies. Specifically, $u_1$ is no longer determined
by $x_1$ and $w_1$ via (19). As a consequence $u_i$ becomes a function of $x_2, \ldots, x_i$
and $u_1$ via (19). (In the following argument $w_1, \ldots, w_n$ are fixed throughout.)
Hence $N_i$ is a function of $x_{i+1}, \ldots, x_1$ and $u_1$, so that

$$\sum_{j=2}^{i+1} (\partial N_i/\partial x_j)\,dx_j + (\partial N_i/\partial u_1)\,du_1 = -(\partial N_i/\partial x_1)\,dx_1.$$

After dividing both sides by $(\partial N_i/\partial x_1)$ (which is positive by Lemma 3)
we can express this in matrix form,

$$\tilde A \begin{bmatrix} du_1 \\ dx_2 \\ \vdots \\ dx_n \end{bmatrix} = \begin{bmatrix} dx_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \tag{28}$$

The sign pattern of $\tilde A$ can be shown, in a way similar to the proof of
Lemma 3, to be of the form

- 0 ••• 0 ‘ 

:+•.•. : 

| o 

L +••••+. 

and using an argument similar to that of Lancaster [3], we get

$$\begin{bmatrix} du_1 \\ dx_2 \end{bmatrix} = \begin{bmatrix} + \\ + \end{bmatrix} dx_1. \tag{29}$$


Next, we observe that if we fix $x_2$ and consider only classes 2, ..., n we
are in the same situation as the one considered above (where we fixed $x_1$
and considered classes 1, ..., n). Thus (29) stated for this situation is of the
form

$$\begin{bmatrix} du_2 \\ dx_3 \end{bmatrix} = \begin{bmatrix} + \\ + \end{bmatrix} dx_2 = \begin{bmatrix} + \\ + \end{bmatrix} dx_1,$$

where the last equality follows from (29). Proceeding inductively we obtain

$$\begin{bmatrix} du_1 \\ \vdots \\ du_n \end{bmatrix} = \begin{bmatrix} + \\ \vdots \\ + \end{bmatrix} dx_1 \quad \text{and} \quad \begin{bmatrix} dx_2 \\ \vdots \\ dx_n \end{bmatrix} = \begin{bmatrix} + \\ \vdots \\ + \end{bmatrix} dx_1, \tag{30}$$

an intuitive result, since it states that if the city boundary is shifted outwards
then all classes get more suburbanized and everyone’s welfare
improves. We can now prove result (i) stated in the introduction.


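Theorem 1. Under Assumptions 1 and 2,

$$\partial x_j/\partial N_i > 0, \ j \le i; \qquad \partial x_j/\partial N_i < 0, \ j > i; \qquad \text{and } \partial u_j/\partial N_i < 0 \text{ all } j.$$

Proof. Suppose $dN_j = 0$, $j \ne i$, $dw_j = 0$ all j. Then (25) reads

$$A\,dx = (0, \ldots, 0, dN_i, 0, \ldots, 0)'.$$

By Lemma 4 and then by Lemma 2 we get

$$\begin{bmatrix} dx_1 \\ \vdots \\ dx_i \\ dx_{i+1} \end{bmatrix} = \begin{bmatrix} + \\ \vdots \\ + \\ - \end{bmatrix} dN_i, \qquad \begin{bmatrix} du_1 \\ \vdots \\ du_i \end{bmatrix} = \begin{bmatrix} - \\ \vdots \\ - \end{bmatrix} dN_i. \tag{31}$$

The behavior of the remaining classes, i + 1, ..., n, is the same as if they
lived in a city of radius $x_{i+1}$ which changed by $dx_{i+1}$. Hence from (30),
and (31),

$$\begin{bmatrix} du_{i+1} \\ \vdots \\ du_n \end{bmatrix} = \begin{bmatrix} + \\ \vdots \\ + \end{bmatrix} dx_{i+1} = \begin{bmatrix} - \\ \vdots \\ - \end{bmatrix} dN_i. \tag{32}$$

The result follows from (31), (32).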

We now evaluate the effects of income changes. Once again this is
easy to see when only the innermost class’s income, $w_n$, changes. From
Lemmas 2-4, (25), (26), it is easy to show that

$$\begin{bmatrix} dx_1 \\ \vdots \\ dx_n \end{bmatrix} = \begin{bmatrix} + \\ \vdots \\ + \end{bmatrix} dw_n, \qquad \begin{bmatrix} du_1 \\ \vdots \\ du_{n-1} \end{bmatrix} = \begin{bmatrix} - \\ \vdots \\ - \end{bmatrix} dw_n. \tag{33}$$

Let us show that $\partial u_n/\partial w_n > 0$, i.e.,

$$du_n = [+]\,dw_n. \tag{34}$$

From (17) we get

$$\frac{dN_n}{dw_n} = -\int_{x_{n+1}}^{x_n} \frac{\theta(x)x}{[h^n]^2} \left(h^n_u\,\frac{\partial u_n}{\partial w_n} + h^n_I\right) dx + \frac{\theta(x_n)\,x_n}{h^n(u_n, w_n - t(x_n))}\,\frac{\partial x_n}{\partial w_n} = 0,$$

since $dN_n = dx_{n+1} = 0$. Since the second term on the right is positive
by (33), this relation can hold only if there exists $x \in (x_{n+1}, x_n)$, where

$$h^n_u (\partial u_n/\partial w_n) + h^n_I > 0,$$

which implies (34) because of (8). Thus the case where only $w_n$ changes is
completely settled. For later purposes, however, we need to determine how
the rent at the CBD changes with $w_n$, i.e., we need the sign of $\partial r_{n+1}/\partial w_n$
where $r_{n+1} = r^n(u_n, w_n - t(x_{n+1}))$. We will show that

$$\partial r_{n+1}/\partial w_n < 0,$$

provided that the following assumption holds.


Assumption 3.

$$[\theta(x)x]^{-1}(d/dx)(\theta(x)x) > [t_x(x)]^{-1}(d/dx)(t_x(x)) \quad \text{all } x.$$

We will discuss this condition later.¹ In addition to $r_{n+1}$, also define

$$r_1 = r_A, \qquad r_{i+1} = r^i(u_i, w_i - t(x_{i+1})), \quad i = 1, \ldots, n. \tag{35}$$

¹ Note that if $\theta(x) = 2\pi$, Assumption 3 reads $1 > (x/t_x(x))(d/dx)(t_x(x))$, which is the
condition imposed in [2, p. 232], whereas if $t_x$ is constant then Assumption 3 requires
only that $\theta(x)x$ is increasing in x.

These rents are interrelated by the formulas

$$N_i = \frac{\theta(x_{i+1})\,x_{i+1}}{t_x(x_{i+1})}\,r_{i+1} - \frac{\theta(x_i)\,x_i}{t_x(x_i)}\,r_i + \int_{x_{i+1}}^{x_i} \frac{d}{dx}\left[\frac{\theta(x)x}{t_x(x)}\right] r^i(u_i, w_i - t(x))\,dx. \tag{36}$$

The formula is obtained by differentiating the identity

$$E^i(r^i(u_i, w_i - t(x)), u_i) = w_i - t(x),$$

to obtain

$$h^i r^i_x = -t_x,$$

so that

$$N_i = \int_{x_{i+1}}^{x_i} \frac{\theta(x)x}{h^i(x)}\,dx = -\int_{x_{i+1}}^{x_i} \frac{\theta(x)x}{t_x(x)}\,r^i_x(x)\,dx,$$

which, upon integration by parts, yields (36).² Now differentiate (36)

$$\frac{dN_i}{dw_n} = \frac{\theta(x_{i+1})\,x_{i+1}}{t_x(x_{i+1})}\,\frac{\partial r_{i+1}}{\partial w_n} - \frac{\theta(x_i)\,x_i}{t_x(x_i)}\,\frac{\partial r_i}{\partial w_n} + \int_{x_{i+1}}^{x_i} \frac{d}{dx}\left[\frac{\theta(x)x}{t_x(x)}\right] \frac{\partial}{\partial w_n}\,r^i(u_i, w_i - t(x))\,dx. \tag{37}$$

Now in this relation $(d/dx)[\theta(x)x/t_x(x)] > 0$ by Assumption 3 and
$(\partial/\partial w_n)\,r^i(u_i, w_i - t(x)) = r^i_u(\partial u_i/\partial w_n) > 0$ for i < n by (33), so that,
since $dN_i = 0$,

$$\frac{\theta(x_{i+1})\,x_{i+1}}{t_x(x_{i+1})}\,\frac{\partial r_{i+1}}{\partial w_n} < \frac{\theta(x_i)\,x_i}{t_x(x_i)}\,\frac{\partial r_i}{\partial w_n}. \tag{38}$$

Starting with the fact that $(\partial r_1/\partial w_n) = 0$ we can proceed recursively using
(38) to conclude that

$$\partial r_2/\partial w_n < 0, \ldots, \partial r_n/\partial w_n < 0.$$

An argument similar to the one used in Lemma 3 shows that

$$\frac{\partial}{\partial w_n}\,r^n(u_n, w_n - t(x)) = r^n_u\,\frac{\partial u_n}{\partial w_n} + r^n_I = \frac{E^n_u(\cdots x \cdots)}{h^n(u_n, w_n - t(x))} \left[\frac{1}{E^n_u(\cdots x \cdots)} - \frac{\partial u_n}{\partial w_n}\right].$$

In the last expression the term in square brackets is increasing in x by
virtue of (6) so that if $(\partial/\partial w_n)\,r^n(u_n, w_n - t(x_{n+1})) = (\partial r_{n+1}/\partial w_n) \ge 0$,
then we must have $(\partial/\partial w_n)\,r^n(u_n, w_n - t(x)) > 0$ for $x > x_{n+1}$, which
contradicts (37). Thus we have proved

$$\partial r_{n+1}/\partial w_n < 0, \tag{39}$$

as desired. Equation (39) implies that the rent at the CBD decreases as the
income of the innermost class increases. While it is obvious that the
density at the CBD must decrease with increases in $w_n$, since people in
class n will demand more land, it does not follow that the reduction in
density will be so large that rents will decrease also. To reach this conclusion
Assumption 3 is crucial.

² This formula is interesting in its own right since it directly relates the population
to the rent function and transportation cost.
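Formula (36), which expresses the population of a ring as two boundary rent terms plus an integral of the bid rent weighted by $(d/dx)[\theta(x)x/t_x(x)]$, can be verified numerically in a special case. The sketch below assumes a Cobb-Douglas utility with exponent 0.75 on the other good, a linear commuting cost $t(x) = 1200x$ and a constant $\theta(x)$; the wage, utility level and ring boundaries are hypothetical values picked only to make the check concrete. Both sides are evaluated with a composite Simpson rule.

```python
import math

# ASSUMED for illustration: Cobb-Douglas exponents, constant theta, linear t(x).
A = 0.75
B = 1.0 - A
TAU = 1200.0             # t(x) = TAU * x, so t_x = TAU
THETA = 0.4 * math.pi    # constant theta(x)
W, U = 12000.0, 150.0    # hypothetical wage and utility level of the class
X_IN, X_OUT = 1.6, 2.4   # hypothetical inner and outer radius of its ring


def bid_rent(x):
    """Bid rent of the class at distance x for the assumed utility."""
    return B * (A ** A * (W - TAU * x) / U) ** (1.0 / B)


def land_per_person(x):
    """Land demand h at distance x when the bid rent is paid."""
    return B * (W - TAU * x) / bid_rent(x)


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    step = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * step)
    return total * step / 3


# Left side: land market clearing, population = integral of density, cf. (17).
population_direct = simpson(lambda x: THETA * x / land_per_person(x),
                            X_IN, X_OUT)

# Right side: boundary rent terms plus weighted rent integral, cf. (36).
# With constant theta and t_x, d/dx[theta * x / t_x] is simply THETA / TAU.
population_from_rents = (
    THETA * X_IN / TAU * bid_rent(X_IN)
    - THETA * X_OUT / TAU * bid_rent(X_OUT)
    + simpson(lambda x: THETA / TAU * bid_rent(x), X_IN, X_OUT)
)
```

In this special case Assumption 3 holds automatically, because $t_x$ is constant and $\theta(x)x$ is increasing in x.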





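We will make critical use of (39) to study the effects of a change in $w_i$,
i < n. But first let us note that if dN = 0, dw = 0 but $dx_{n+1} \ne 0$ in (25),
then we have $A\,dx = -a\,dx_{n+1}$ so that from (26) and (27) we get

$$\begin{bmatrix} dx_1 \\ \vdots \\ dx_n \end{bmatrix} = \begin{bmatrix} + \\ \vdots \\ + \end{bmatrix} dx_{n+1}, \tag{40a}$$

and, using Lemma 2,

$$\begin{bmatrix} du_1 \\ \vdots \\ du_n \end{bmatrix} = \begin{bmatrix} - \\ \vdots \\ - \end{bmatrix} dx_{n+1}. \tag{40b}$$

Equation (40b) is obvious; it asserts that everyone is worse off (due to
increased transport cost) if the CBD size grows; (40a) is perhaps less
obvious; it asserts that all classes are more suburbanized (even without
any increase in income) if the CBD grows.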

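Now consider a change in $w_i$, for some i < n. Setting $dx_{n+1} = 0$,
dN = 0 and $dw_j = 0$, $j \ne i$, in (25), we see using (26) that

$$A \begin{bmatrix} dx_1 \\ \vdots \\ dx_n \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ + \\ \vdots \\ + \end{bmatrix} dw_i, \tag{41}$$

where exactly the first i - 1 components are zero. From (41) and (27),
and then from Lemma 2, we can deduce that

$$\begin{bmatrix} dx_1 \\ \vdots \\ dx_i \end{bmatrix} = \begin{bmatrix} + \\ \vdots \\ + \end{bmatrix} dw_i \quad \text{and} \quad \begin{bmatrix} du_1 \\ \vdots \\ du_{i-1} \end{bmatrix} = \begin{bmatrix} - \\ \vdots \\ - \end{bmatrix} dw_i.$$

Thus classes living further than i get more suburbanized and suffer a loss
in real income when the income of class i increases.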

As regards the classes i + 1, ..., n, their allocation is the same as if they
lived in a city of radius $x_{i+1}$ and the other classes did not exist at all.
Since class i + 1 lives farthest amongst these classes we can use (30) to
conclude that

$$\begin{bmatrix} du_{i+1} \\ \vdots \\ du_n \end{bmatrix} = \begin{bmatrix} + \\ \vdots \\ + \end{bmatrix} dx_{i+1}, \qquad \begin{bmatrix} dx_{i+2} \\ \vdots \\ dx_n \end{bmatrix} = \begin{bmatrix} + \\ \vdots \\ + \end{bmatrix} dx_{i+1}, \tag{42}$$

so that we must determine the sign of $dx_{i+1}$.

But before we do this let us note that classes 1, ..., i receive the same
allocation as if they lived alone in a city whose CBD radius is $x_{i+1}$. In
this sense $u_i$ is determined by $x_{i+1}$ and $w_i$, and we may express this as

$$u_i = F(x_{i+1}, w_i)$$

and since class i is the innermost among these classes we can use (40) and
(34) respectively to conclude that

$$\partial F/\partial x_{i+1} < 0, \qquad \partial F/\partial w_i > 0. \tag{43}$$

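To determine the sign of $dx_{i+1}$ we differentiate the equilibrium condition
(19)

$$r^{i+1}(u_{i+1}, w_{i+1} - t(x_{i+1})) = r^i(u_i, w_i - t(x_{i+1})),$$

to obtain (recall that below $u_{i+1}$ is a function only of $x_{i+1}$)

$$r^{i+1}_u\,\frac{\partial u_{i+1}}{\partial x_{i+1}}\,dx_{i+1} + r^{i+1}_x\,dx_{i+1} = r^i_u \left[\frac{\partial F}{\partial x_{i+1}}\,dx_{i+1} + \frac{\partial F}{\partial w_i}\,dw_i\right] + r^i_I\,dw_i + r^i_x\,dx_{i+1}, \tag{44}$$

which can be rearranged as

$$\left[r^{i+1}_u\,\frac{\partial u_{i+1}}{\partial x_{i+1}} + (r^{i+1}_x - r^i_x) - r^i_u\,\frac{\partial F}{\partial x_{i+1}}\right] dx_{i+1} = \left[r^i_u\,\frac{\partial F}{\partial w_i} + r^i_I\right] dw_i.$$

Now $r^{i+1}_u < 0$ by (7), $\partial u_{i+1}/\partial x_{i+1} > 0$ by (42); $(r^{i+1}_x - r^i_x) < 0$ by
Lemma 1; $-r^i_u(\partial F/\partial x_{i+1}) < 0$ by (7) and (43). Thus the coefficient
multiplying $dx_{i+1}$ is negative. On the other hand, by (39), we know that

$$\frac{\partial r_{i+1}}{\partial w_i} = r^i_u\,\frac{\partial F}{\partial w_i} + r^i_I < 0. \tag{45}$$

Thus we have shown that

$$dx_{i+1} = [+]\,dw_i.$$

Only the sign of $du_i$ remains. From (44) we note that

$$r^i_u\,du_i = r^i_u \left[\frac{\partial F}{\partial w_i}\,dw_i + \frac{\partial F}{\partial x_{i+1}}\,dx_{i+1}\right]$$

$$= r^{i+1}_u\,\frac{\partial u_{i+1}}{\partial x_{i+1}}\,dx_{i+1} + (r^{i+1}_x - r^i_x)\,dx_{i+1} - r^i_I\,dw_i$$

$$= [-]\,dw_i, \tag{46}$$

whence, since $r^i_u < 0$ by (7), we get

$$du_i = [+]\,dw_i. \tag{47}$$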

Theorem 2. Under Assumptions 1-3, $\partial x_j/\partial w_i > 0$ all j; and $\partial u_j/\partial w_i > 0$
if $j \ge i$, $\partial u_j/\partial w_i < 0$ if j < i. For i = n Assumption 3 is not needed.

Proof. The first assertion follows from (41), (42), and (46), whereas
(42), (46), and (47) imply that $\partial u_j/\partial w_i > 0$ if $j \ge i$. The remaining
assertion has already been proved.

The critical step in proving Theorem 2 was the establishment of (46),
or (43), which assert that an increase in $w_i$ causes not only a suburbanization
of the ith class but a reduction in the rent faced by the
(i + 1)st class. To reach this conclusion Assumption 3 is imposed. Indeed
if Assumption 3 is violated, which can happen if at some distance x
transportation cost increases very rapidly or land available for housing
increases very slowly, then it is always possible to choose a set of parameters
$N_1, \ldots, N_n$ and $w_1, \ldots, w_n$ at which the signs in (45) and (46) are
reversed so that Theorem 2 no longer holds.

In the course of deriving Theorem 1 and Theorem 2 we have had to
determine how the equilibrium rent function shifts due to changes in the
$N_i$ and $w_i$. Since this comparative statics result has some independent
interest we state it separately here. Let $r(x) = r(x, N_1, \ldots, N_n, w_1, \ldots, w_n)$
denote the equilibrium rent function.

Theorem 3. Under Assumptions 1 and 2,

$$(\partial r/\partial N_i)(x) > 0, \qquad x_{n+1} \le x \le x_1, \quad \text{all } i.$$

Under Assumptions 1-3,

$$(\partial r/\partial w_i)(x) > 0, \qquad x_i \le x \le x_1,$$

and

$$(\partial r/\partial w_i)(x) < 0, \qquad x_{n+1} \le x \le x_{i+1}, \quad \text{all } i.$$


4. A Numerical Example 

We consider here a city model with two population classes. The 
parameter values chosen are adapted from [1]. There is only one other 
good besides land. Every individual has the same utility function. 


U{c, h) = c°- 7S A°'“ 



TABLE I

Income = $12,000.00

N_1         79,000   97,000  116,000  149,000  171,000  207,000  246,000
x_1           2.46     2.57     2.68     2.85     2.95     3.10     3.25
x_2           1.60     1.58     1.55     1.53     1.50     1.48     1.45
u_2           58.1     57.6     57.1     56.2     55.7     54.9     54.1
u_1          153.2    151.2    149.2    146.2    144.4    141.7    139.1
r_2         29,400   31,300   33,300   36,600   38,800   42,300   46,000
r_3         54,700   56,700   58,700   62,700   64,700   68,700   72,700
C_2(x_3)      0.00       45       89      174      214      293      368
C_2(x_2)      0.00       39       77      152      189      260      329
C_1(x_2)      0.00      152      302      547      693      930     1161
C_1(x_1)      0.00      135      265      470      587      770      942




TABLE II

Population N_1 = 200,000

w_1         97,000   65,000   25,000   11,500     6000
x_1           8.95     7.21     4.43     3.02     2.30
x_2            1.6     1.58     1.53     1.48     1.45
u_2           58.1     57.6     56.2     54.9     54.1
u_1           1308      853      320      136       66
r_2         29,400   31,300   36,600   42,300   46,000
r_3         54,700   56,700   62,700   68,700   72,700
C_2(x_3)      0.00       45      174      293      368
C_2(x_2)      0.00       39      152      260      329


The population of the poor class is fixed at $N_2 = 135,000$ and their annual
income is fixed at $w_2$ = \$5000. The unit of distance (corresponding
roughly to 2 miles) is such that the radius of the CBD is $x_3 = 1$. The
price of the other good is \$1. Transportation cost per annum per individual
is linear in distance and is given by t(x) = \$1200 x. Agricultural rent per
annum per unit area is \$20,000. A constant proportion of the area of each
circular ring is available for housing and it is given by $\theta(x) = 0.4\pi$ rad.

Table I shows how various equilibrium values change as the population
of the rich class, $N_1$, grows from 79,000 to 246,000. The same notation as
before is maintained with the exception of the row labels $C_1(x)$ and $C_2(x)$.
$C_1(x)$ is the compensation in dollars per annum necessary for one rich
individual at x to achieve the utility level $u_1 = 153.2$; whereas $C_2(x)$ is the
corresponding compensation for a poor individual at x to maintain the
utility level $u_2 = 58.1$. Thus $C_i(x)$ is a simple measure of welfare loss.

A comparison of these two tables reveals that the loss incurred by a 
poor individual due to a threefold increase in the population of the rich 
from 79,000 to 246,000 is the same as a sixteen-fold decrease in their 
income from $97,000 to $6,000. 


References 

1. R. M. Solow, Congestion cost and the use of land for streets, Bell J. Econ.
Management (1973), 602-618.

2. W. C. Wheaton, A comparative static analysis of urban spatial structure, J. Econ.
Theory 9 (1974), 223-237.

3. K. Lancaster, The scope of qualitative economics, Rev. Econ. Studies (1962),
99-123.

4. T. Miyao, Dynamics and comparative statics in the theory of residential location,
J. Econ. Theory 11 (1975), 133-146.

5. W. C. Wheaton, On the optimal distribution of income among cities, J. Urban
Econ. 3 (1976), 31-44.



JOURNAL OF ECONOMIC THEORY 13, 414-427 (1976) 


Choice Functions, “Rationality” Conditions, and 
Variations on the Weak Axiom of Revealed Preference 

Thomas Schwartz 

Department of Government, University of Texas at Austin, Austin, Texas 78712 
Received August 11, 1975 


Models of individual and collective choice commonly require choice 
functions to fulfill certain so-called “rationality” conditions formulated 
in terms of binary relations of “preference” and “indifference.” Arrow [1] 
proved the strongest of these conditions equivalent to a version of the 
Weak Axiom of Revealed Preference.¹ I will prove that the other familiar
“rationality” conditions of the same general type (all weaker, of course, 
than the one discussed by Arrow) are each equivalent to a simple variant 
of the Weak Axiom of Revealed Preference. 

These conditions of “rational” choice all require of a choice function 
that it be derivable in the usual way from an underlying binary 
“preference” relation (which means the function always “chooses” the 
most preferred of the feasible alternatives). For this to be possible, the 
preference relation must at least be a suborder (which means there are no 
cyclic preferences). The weakest condition requires nothing more than 
derivability from an underlying suborder. The strongest requires an 
underlying weak order (which means there are no cyclic or other intransitive 
preferences and no intransitivities in the corresponding “indifference” 
relation). The other conditions require an underlying semiorder, interval 
order, or strict partial order.²

The Weak Axiom of Revealed Preference and its variants are simpler 
conditions on choice functions, involving no explicit reference to binary 


¹ This assumes that the choice function in question is defined for all the finite,
nonempty subsets of some given set. The Weak Axiom originally was formulated for the
case where, for some n, the choice function is defined just for those sets consisting
each of all the n-dimensional commodity bundles satisfying a budget constraint, hence
only for certain infinite sets. See Samuelson [8] and Houthakker [3].

² These “rationality” conditions have all been called in question, even the weakest,
from both empirical and normative points of view. See, for example, May [6], Plott [7],
Schwartz [9, 10], Tversky [13].


relations. Roughly speaking, they require that choices not be sensitive, 
in specified ways, to changes in the set of feasible alternatives. 

The excellent paper of Sen [12], extending the treatment of Sen
[11, Chap. 1*], contains several results along the lines of this paper,
formulated and motivated in a somewhat different way. Jamison and
Lau [4] relate a number of the rationality conditions discussed here to
variants of the Weak Axiom of Revealed Preference. When deducing the
former from the latter, however, they assume instead of deducing the basic
condition of derivability from an underlying suborder.


1. Choice Functions, Preference and Indifference:
Background Assumptions

We will be examining various conditions on a function C, a so-called 
choice function, for which there exists a set V fulfilling the following three 
axioms, given the preceding definition: 

Definition. $S$ = the family of finite, nonempty subsets of $V$.

Axiom I. $V$ is nonempty.

Axiom II. $C: S \to S$.

Axiom III. $C(\alpha) \subseteq \alpha$ for all $\alpha \in S$.

Interpretation. C represents some rule, procedure, criterion or 
mechanism, or some set of tastes, values, goals or behavioral dispositions, 
that govern or might govern the choices of some agent, individual or 
collective, from finite subsets of V. Let us refer to this rule, procedure or 
whatever as R. Then for $\alpha \in S$, $C(\alpha)$ comprises those elements of $\alpha$ that R
allows an agent to choose (those elements that could be chosen by an agent
whose choices were governed by R), given that $\alpha$ exhausts the feasible
alternatives. I will call the elements of $C(\alpha)$ choosable from $\alpha$.

Corresponding to C are relations P of (strict) preference and I of 
indifference, defined thus: 

Definition. $xPy$ iff $x, y \in V$, $x \neq y$ and $C(\{x, y\}) = \{x\}$.

Definition. $xIy$ iff $x, y \in V$ and $C(\{x, y\}) = \{x, y\}$.

By themselves, these definitions imply:





If $xPy$ then not $yPx$ (asymmetry of $P$).

Not $xPx$ (irreflexivity of $P$).

If $xIy$ then $yIx$ (symmetry of $I$).

If $xPy$ then not $xIy$ (incompatibility of $P$ and $I$).

Combined with Axioms I–III and the definition of $S$, the definitions of $P$
and $I$ imply:

If $x \in V$, $xIx$ (reflexivity of $I$ in $V$).

If $x, y \in V$, exactly one of the following: $xPy$, $yPx$ or $xIy$ (trichotomy
with respect to $P$ and $I$ in $V$).

If $x, y \in V$, either $x(P \cup I)y$ or $y(P \cup I)x$ (strong connexity in $V$ of the
relation $P \cup I$ of “preference or indifference”).

$xIy$ iff $x, y \in V$ but neither $xPy$ nor $yPx$ (indifference characterizable
as absence of preference in either direction).

When stating and proving various properties of C, P and I, I will tacitly 
assume Axioms I—III, the definitions of S, P and I, and therewith the 
consequences just listed. 
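
The definitions of P and I can be illustrated computationally. The Python below is only a sketch under stated assumptions: V is a small finite set, a choice function is stored as a dictionary from frozensets to frozensets, and the helper names (`finite_subsets`, `prefers`, `indifferent`) and the smallest-element example are hypothetical illustrations, not part of the paper.

```python
from itertools import combinations

def finite_subsets(V):
    """The family S: every nonempty subset of the finite set V."""
    items = sorted(V)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def prefers(C, x, y):
    """xPy: x and y are distinct and C({x, y}) = {x}."""
    return x != y and C[frozenset({x, y})] == frozenset({x})

def indifferent(C, x, y):
    """xIy: C({x, y}) = {x, y}."""
    return C[frozenset({x, y})] == frozenset({x, y})

# Hypothetical choice function: from each set, choose its smallest element.
V = {1, 2, 3}
C = {a: frozenset({min(a)}) for a in finite_subsets(V)}

# Axioms II and III: C(a) is a nonempty subset of a.
axioms_hold = all(len(C[a]) > 0 and C[a] <= a for a in C)

# Trichotomy: exactly one of xPy, yPx, xIy holds for all x, y in V.
trichotomy = all(
    [prefers(C, x, y), prefers(C, y, x), indifferent(C, x, y)].count(True) == 1
    for x in V for y in V
)
print(axioms_hold, trichotomy)
```

Storing C as a dictionary keyed by frozensets keeps the sketch close to the set-theoretic notation; it is practical only for very small V, since S grows exponentially.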


2. “Rationality” Conditions Formulated 
in Terms of Binary Relations 

The weakest of the traditional conditions of “rational” choice says $C(\alpha)$
is identically the set of P-undominated (most preferred) elements of $\alpha$:

Binary Choice Property (BICH). $C(\alpha) = \{x \in \alpha \mid yPx \text{ for no } y \in \alpha\}$.

This is equivalent to the assumption that there exists some binary relation
or other under which $C(\alpha)$ is identically the set of undominated elements
of $\alpha$. For it is easy to show that any such relation must be P, which is one
reason for defining P as I did.

Since, for every $\alpha \in S$, $C(\alpha)$ has at least one element (by Axiom II),
BICH implies that every $\alpha \in S$ has at least one P-undominated element.
This consequence is easily seen to be equivalent to:

P-Acyclicity. Not $x_1 P x_2 P \cdots x_n P x_1$ [P is a suborder].
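
As a hedged illustration of BICH and P-Acyclicity, the following Python sketch tests both on small examples. It assumes a finite V and a dictionary representation of C; the function names and both example choice functions are hypothetical. The acyclicity test relies on the equivalence just stated in the text (every set in S has a P-undominated element) rather than on explicit cycle detection.

```python
from itertools import combinations

def finite_subsets(V):
    """The family S: every nonempty subset of the finite set V."""
    items = sorted(V)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def prefers(C, x, y):
    """xPy: x and y are distinct and C({x, y}) = {x}."""
    return x != y and C[frozenset({x, y})] == frozenset({x})

def undominated(C, a):
    """Elements x of a such that yPx for no y in a."""
    return frozenset(x for x in a if not any(prefers(C, y, x) for y in a))

def satisfies_bich(C):
    """BICH: C(a) is exactly the set of P-undominated elements of a."""
    return all(C[a] == undominated(C, a) for a in C)

def is_p_acyclic(C):
    """Every set in S has a P-undominated element (equivalent to P-Acyclicity)."""
    return all(len(undominated(C, a)) > 0 for a in C)

V = {1, 2, 3}
S = finite_subsets(V)

# Hypothetical example 1: always choose the smallest element.
smallest = {a: frozenset({min(a)}) for a in S}

# Hypothetical example 2: a preference cycle 1P2, 2P3, 3P1; C(V) = V.
cyclic = {a: a for a in S}
cyclic[frozenset({1, 2})] = frozenset({1})
cyclic[frozenset({2, 3})] = frozenset({2})
cyclic[frozenset({1, 3})] = frozenset({3})

print(satisfies_bich(smallest), is_p_acyclic(smallest))
print(satisfies_bich(cyclic), is_p_acyclic(cyclic))
```

The cyclic example shows why acyclicity is necessary for BICH: with a cycle on V, the set of P-undominated elements of V is empty, whereas Axiom II requires C(V) to be nonempty.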

We get stronger “rationality” conditions by conjoining each of the
following four transitivity conditions to BICH:

P-Transitivity. If xPyPz then xPz [P is a strict partial order]. 






PIP-Transitivity. If xPyIzPw then xPw [$P \cup I$ is an interval order].

PIP + IPP-Transitivity. If xPyIzPw then xPw, and if xIyPzPw then
xPw [$P \cup I$ is a semiorder].

P + I-Transitivity. If xPyPz then xPz, and if xIyIz then xIz [$P \cup I$
is a weak order; $P \cup I$ is transitive; for all $x, y, z \in V$, if not xPy and not
yPz then not xPz].

These conditions are equivalent to their bracketed mates, assuming, as
always, Axioms I–III and the definitions of S, P and I.

P + I-Transitivity is stronger than PIP + IPP-Transitivity. The latter
is stronger than PIP-Transitivity, which in turn is stronger than
P-Transitivity. And, like BICH, P-Transitivity is stronger than
P-Acyclicity. These logical relationships are all trivial.

Less familiar than the other transitivity conditions, PIP-Transitivity and
PIP + IPP-Transitivity (the interval order and semiorder conditions)
are designed to allow intransitivities due to limited powers of
discrimination, that is, to nonnoticeable differences adding up to noticeable ones.
For example, we could have xIyIzPx (contrary to P + I-Transitivity but
consistent with PIP- and PIP + IPP-Transitivity), if a nonnoticeable
difference between x and y and a nonnoticeable difference between y and z
added up to a noticeable difference between x and z. Assuming V is
countable, P + I-Transitivity (the weak order condition) holds iff P can be
interpreted as the relation of one element of V to another when the first
has a greater “utility” than the second; PIP-Transitivity (the interval order
condition) holds iff P can be interpreted as the relation of one element of
V to another when the first has a noticeably greater utility than the second;
and PIP + IPP-Transitivity (the semiorder condition) holds iff P can be
so interpreted that (1) P is the relation of one element of V to another
when the first has a noticeably greater utility than the second, and
(2) a utility difference noticeable at any level of satisfaction is noticeable
at all levels. To be more precise, assuming V is countable, P + I-, PIP- and
PIP + IPP-Transitivity are respectively equivalent to these conditions:

For some real-valued (“utility”) function $u$ on $V$, and for all $x, y \in V$,
$xPy$ iff $u(x) > u(y)$.

For some real-valued functions $u$ and $d$ on $V$, with $d$ (the
“discriminability” measure) positive, and for all $x, y \in V$, $xPy$ iff
$u(x) > u(y) + d(y)$.

For some real-valued function $u$ on $V$, and for all $x, y \in V$, $xPy$ iff
$u(x) > u(y) + 1$.

(See [5] and [2].)
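
The semiorder representation can be made concrete with a small sketch. The Python below assumes hypothetical utility values chosen so that neighbouring alternatives differ by less than the threshold 1 while the extremes differ by more; the names `u`, `P`, `I` and `holds` are illustrative only. It reproduces the pattern xIyIzPx described above.

```python
from fractions import Fraction
from itertools import product

# Hypothetical utilities: neighbours differ by 3/5, the extremes by 6/5.
u = {"x": Fraction(0), "y": Fraction(3, 5), "z": Fraction(6, 5)}
V = list(u)

def P(a, b):
    """Semiorder representation: aPb iff u(a) > u(b) + 1."""
    return u[a] > u[b] + 1

def I(a, b):
    """Indifference as absence of preference in either direction."""
    return not P(a, b) and not P(b, a)

def holds(premise, conclusion, arity):
    """True if premise implies conclusion for every tuple over V."""
    return all(conclusion(*t)
               for t in product(V, repeat=arity) if premise(*t))

# PIP: aPb, bIc, cPd imply aPd.  IPP: aIb, bPc, cPd imply aPd.
pip = holds(lambda a, b, c, d: P(a, b) and I(b, c) and P(c, d),
            lambda a, b, c, d: P(a, d), 4)
ipp = holds(lambda a, b, c, d: I(a, b) and P(b, c) and P(c, d),
            lambda a, b, c, d: P(a, d), 4)
i_transitive = holds(lambda a, b, c: I(a, b) and I(b, c),
                     lambda a, b, c: I(a, c), 3)

print(I("x", "y"), I("y", "z"), P("z", "x"))
print(pip, ipp, i_transitive)
```

Exact fractions are used instead of floats so that the threshold comparisons are not affected by rounding.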





3. Variations on the Weak Axiom of Revealed Preference 

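Arrow [1] proved BICH and P + I-Transitivity conjointly equivalent to:

Weak Axiom of Revealed Preference (WARP). Suppose $\alpha, \beta \in S$,
$\beta \subseteq \alpha$, and $C(\alpha) \not\subseteq \beta$. Then $C(\alpha - \beta) = C(\alpha) - \beta$.

I will prove the weaker “rationality” conditions displayed in the last
section each equivalent to one or a pair of the following five axioms, all
weakened versions of WARP:

W1. Suppose $\alpha, \beta \in S$. Then $C(\alpha) \cap C(\beta) = C(\alpha \cup \beta) \cap \alpha \cap \beta$.

W2. Suppose $\alpha, \beta \in S$, $\beta \subseteq \alpha$ and $C(\alpha) \not\subseteq \beta$. Then $C(\alpha - \beta) \subseteq C(\alpha)$.

W3. Suppose $\alpha, \beta \in S$, $\beta \subseteq \alpha$ and $C(\alpha) \not\subseteq \beta$. Then if either
$C(\beta) \not\subseteq C(\alpha)$ or $C(\alpha - C(\beta)) - \beta \subseteq C(\alpha)$, $C(\alpha - \beta) \subseteq C(\alpha)$.

W4. Suppose $\alpha, \beta \in S$, $\beta \subseteq \alpha$ and $C(\alpha) \not\subseteq \beta$. Then if $C(\beta) \not\subseteq C(\alpha)$,
$C(\alpha - \beta) \subseteq C(\alpha)$.

W5. Suppose $\alpha, \beta \in S$. Then if $\beta \subseteq \alpha - C(\alpha)$, $C(\alpha - \beta) \subseteq C(\alpha)$.

Each of these axioms says $C(\alpha)$ is not sensitive, in specified ways, to
changes in $\alpha$.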

Take two sets of alternatives, one a subset of the other. Suppose some
choosable members of the bigger set ($\alpha$) belong to the smaller set ($\alpha - \beta$).
Then they are exactly the choosable members of the smaller set, according
to WARP. So, read from right to left, WARP says that if we alter the set of
feasible alternatives by eliminating some members, then every
noneliminated choosable alternative is still choosable. Read from left to right,
WARP says this: If we alter the set of feasible alternatives by adding
some members, then every original choosable is still choosable if any is.

W1 says that if we divide the set of feasible alternatives ($\alpha \cup \beta$) into two
(possibly overlapping) subsets ($\alpha$, $\beta$), then the alternatives (if any) that are
choosable from both subsets are precisely the common elements of the
two subsets that are choosable from the whole set. So, read from left
to right, W1 says that if we expand the set of feasible alternatives (from $\alpha$
to $\alpha \cup \beta$) by adding another (overlapping) set to it, then every original
choosable that is also choosable from the added set is choosable from the
expanded set. Read from right to left, W1 says that if we reduce the set of
feasible alternatives (to either of two subsets into which it has been divided,
say from $\alpha \cup \beta$ to $\alpha$), then every noneliminated choosable is still choosable.

W2 is just the left-to-right half of WARP. It says that if we expand the
set of feasible alternatives (from $\alpha - \beta$ to $\alpha$), then everything originally
choosable is still choosable if any is.
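
WARP, W1 and W2 can be checked mechanically on a small finite V. The sketch below is an illustration only: it assumes a dictionary representation of C over all nonempty subsets of V, and the function names are hypothetical. The second example is the paper's example (2), given later in this section, which satisfies W2 but not WARP.

```python
from itertools import combinations

def finite_subsets(V):
    """The family S: every nonempty subset of the finite set V."""
    items = sorted(V)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def warp(C):
    """If b is a subset of a and C(a) is not inside b, then C(a - b) = C(a) - b."""
    return all(C[a - b] == C[a] - b
               for a in C for b in C if b <= a and not C[a] <= b)

def w1(C):
    """C(a) & C(b) = C(a | b) & a & b for all a, b in S."""
    return all(C[a] & C[b] == C[a | b] & a & b for a in C for b in C)

def w2(C):
    """If b is a subset of a and C(a) is not inside b, then C(a - b) is inside C(a)."""
    return all(C[a - b] <= C[a]
               for a in C for b in C if b <= a and not C[a] <= b)

V = {1, 2, 3}
S = finite_subsets(V)

# Hypothetical example: always choose the smallest element.
smallest = {a: frozenset({min(a)}) for a in S}

# Example (2) of the text: 1P2P3, 1P3 and C(V) = {1, 2}.
example2 = dict(smallest)
example2[frozenset(V)] = frozenset({1, 2})

print(warp(smallest), w1(smallest), w2(smallest))
print(warp(example2), w1(example2), w2(example2))
```

Note that whenever C(a) is not contained in b, the difference a - b is nonempty, so C(a - b) is always defined in these checks.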





W3–W5 are qualified versions of W2. They, too, say that if we expand
the set of feasible alternatives (from $\alpha - \beta$ to $\alpha$), then everything originally
choosable (from $\alpha - \beta$) is still choosable (from $\alpha$) if any is, provided:

in the case of W5, that nothing added (nothing in $\beta$) is now choosable
(from $\alpha$);

or, in the case of W4, that not everything choosable from the added
set (from $\beta$) is now choosable (from $\alpha$);

or, in the case of W3, that not everything choosable from the added
set is now choosable, unless each original alternative (in $\alpha - \beta$) is now
choosable (from $\alpha$) if choosable from among the original alternatives plus
the nonchoosable members of the added set (from $\alpha - \beta$ plus $\beta - C(\beta)$,
i.e., from $\alpha - C(\beta)$).

W2–W5 can be verbalized also in terms of reducing rather than expanding
the set of feasible alternatives. According to each, if we eliminate some
feasible alternatives (reducing $\alpha$ to $\alpha - \beta$), but not all choosable
alternatives (not all of $C(\alpha)$), then nothing originally rejected is now
choosable, i.e., everything currently choosable (from $\alpha - \beta$) was originally
choosable (from $\alpha$), provided:

in the case of W5, that nothing choosable (from $\alpha$) was eliminated
(belongs to $\beta$);

or, in the case of W4, that some things choosable from the eliminated
set (from $\beta$) were not originally choosable (from $\alpha$);

or, in the case of W3, that some things choosable from the eliminated
set were not originally choosable, unless each current alternative (in $\alpha - \beta$)
was originally choosable (from $\alpha$) if choosable from among the current
alternatives plus the nonchoosable members of the eliminated set (from
$\alpha - \beta$ plus $\beta - C(\beta)$, i.e., from $\alpha - C(\beta)$).

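These six axioms are logically related as follows:

(i) WARP is stronger than W1, i.e., WARP implies but is independent
of W1.

Proof that WARP implies W1. Assume WARP; let $\alpha, \beta \in S$; to prove that
$C(\alpha) \cap C(\beta) = C(\alpha \cup \beta) \cap \alpha \cap \beta$. If either $C(\alpha \cup \beta) \subseteq \alpha - \beta$ or
$C(\alpha \cup \beta) \subseteq \beta - \alpha$, then $C(\alpha) \cap C(\beta)$ and $C(\alpha \cup \beta) \cap \alpha \cap \beta$ are both
empty, and the theorem is trivial. Suppose, on the other hand, that

    $C(\alpha \cup \beta) \not\subseteq \alpha - \beta$ and $C(\alpha \cup \beta) \not\subseteq \beta - \alpha$.

But

    $\alpha \cup \beta \in S$ and $\beta - \alpha \subseteq \alpha \cup \beta$.

So, by WARP, if $\beta - \alpha \in S$, then

    $C(\alpha \cup \beta) - (\beta - \alpha) = C((\alpha \cup \beta) - (\beta - \alpha))$.

But

    $C(\alpha \cup \beta) - (\beta - \alpha) = C(\alpha \cup \beta) \cap \alpha$,

and

    $(\alpha \cup \beta) - (\beta - \alpha) = \alpha$.

Thus, if $\beta - \alpha \in S$, then $C(\alpha \cup \beta) \cap \alpha = C(\alpha)$. On the other hand, if
$\beta - \alpha \notin S$, then $\beta - \alpha$ is empty, whence $\alpha \cup \beta = \alpha$, and so again
$C(\alpha \cup \beta) \cap \alpha = C(\alpha)$. Either way, then,

    $C(\alpha \cup \beta) \cap \alpha = C(\alpha)$.

Similarly,

    $C(\alpha \cup \beta) \cap \beta = C(\beta)$.

Hence,

    $C(\alpha) \cap C(\beta) = C(\alpha \cup \beta) \cap \alpha \cap C(\alpha \cup \beta) \cap \beta = C(\alpha \cup \beta) \cap \alpha \cap \beta$.

Q.E.D.

To prove WARP independent of W1, let:

    $V = \{1, 2, 3\}$, 1P2P3I1 and $C(V) = \{1\}$.    (1)

Then it is routine to check that W1 holds but WARP does not.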
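
The routine check for example (1) can be sketched in Python. This is an illustration under assumptions, not the author's procedure: the dictionary `C` encodes example (1) by listing the choice from every nonempty subset of V (pairwise choices follow from 1P2, 2P3 and 3I1), and the helper `fs` is a hypothetical shorthand.

```python
def fs(*xs):
    """Shorthand for building a frozenset."""
    return frozenset(xs)

# Example (1): V = {1, 2, 3}, 1P2, 2P3, 3I1 and C(V) = {1}.
C = {
    fs(1): fs(1), fs(2): fs(2), fs(3): fs(3),
    fs(1, 2): fs(1),       # 1P2
    fs(2, 3): fs(2),       # 2P3
    fs(1, 3): fs(1, 3),    # 3I1
    fs(1, 2, 3): fs(1),
}

# W1: C(a) & C(b) = C(a | b) & a & b for all a, b in S.
w1_holds = all(C[a] & C[b] == C[a | b] & a & b for a in C for b in C)

# WARP violations: pairs (a, b) with b inside a, C(a) not inside b,
# and yet C(a - b) different from C(a) - b.
warp_violations = [(a, b) for a in C for b in C
                   if b <= a and not C[a] <= b and C[a - b] != C[a] - b]

print(w1_holds, len(warp_violations) > 0)
```

One violating pair is a = {1, 2, 3} and b = {2}: here C(a - b) = C({1, 3}) = {1, 3}, whereas C(a) - b = {1}.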

(ii) WARP is stronger than W2. It is obvious that WARP implies
W2. To prove WARP independent of W2, let:

    $V = \{1, 2, 3\}$, 1P2P3, 1P3 and $C(V) = \{1, 2\}$.    (2)

(iii) Besides implying W1 and W2, WARP is implied by them, hence
equivalent to their conjunction.

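Proof. Assume W1 and W2; let $\alpha, \beta \in S$, $\beta \subseteq \alpha$ and $C(\alpha) \not\subseteq \beta$; to prove
that $C(\alpha - \beta) = C(\alpha) - \beta$. By W2, $C(\alpha - \beta) \subseteq C(\alpha)$. But
$C(\alpha - \beta) \subseteq \alpha - \beta$. Thus, $C(\alpha - \beta) \subseteq C(\alpha) - \beta$, and it suffices to show that
$C(\alpha) - \beta \subseteq C(\alpha - \beta)$. Suppose $x \in C(\alpha) - \beta$; to prove that $x \in C(\alpha - \beta)$.
Then

    $x \in \alpha - \beta$,

and, since $\beta \subseteq \alpha$, we have $\alpha = (\alpha - \beta) \cup (\beta \cup \{x\})$, so that

    $x \in C((\alpha - \beta) \cup (\beta \cup \{x\}))$.

But

    $x \in \beta \cup \{x\}$.

Therefore,

    $x \in C((\alpha - \beta) \cup (\beta \cup \{x\})) \cap (\alpha - \beta) \cap (\beta \cup \{x\})$.

Hence, by W1, $x \in C(\alpha - \beta)$. Q.E.D.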

(iv) W1 and each subsequent axiom are mutually independent.
Example (2) proves W1 independent of W2–W5, and example (1) proves
W2–W5 independent of W1.

(v) W2 is stronger than W3. It is obvious that W2 implies W3. To
prove W2 independent of W3, let:

    $V = \{1, 2, 3\}$, 1I2I3P1 and $C(V) = \{2, 3\}$.

(vi) W3 is stronger than W4. It is obvious that W3 implies W4.
To prove W3 independent of W4, let:

    $V = \{1, 2, 3, 4\}$ and $C(\alpha) = \{x \in \alpha \mid x \leq y \text{ for all } y \neq 1 \text{ in } \alpha\}$.

(vii) W4 is stronger than W5. That W4 implies W5 is trivial. To
prove W4 independent of W5, let:

    $V = \{1, 2, 3, 4\}$ and $C(\alpha) = \{x \in \alpha \mid x \neq 2 \text{ if } 1 \in \alpha, \text{ and } x \neq 4 \text{ if } 3 \in \alpha\}$.

To sum up: WARP is stronger than W1, stronger than W2, and
equivalent to their conjunction. W1 and each of W2–W5 are mutually
independent. And W2 is stronger than W3, which is stronger than W4,
which in turn is stronger than W5.
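
The separating examples in (vi) and (vii) can be checked by exhaustive search over S. The Python below is a sketch under assumptions: C is a dictionary over all nonempty subsets of V, the helper names (`qualified`, `w3`, `w4`, `w5`, `ex6`, `ex7`) are hypothetical, and the two example choice functions follow the reading of examples (vi) and (vii) as "x is choosable iff x <= y for every y other than 1" and "2 is rejected when 1 is present, 4 when 3 is present".

```python
from itertools import combinations

def finite_subsets(V):
    """The family S: every nonempty subset of the finite set V."""
    items = sorted(V)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def qualified(C, proviso):
    """For b inside a with C(a) not inside b: proviso implies C(a - b) inside C(a)."""
    return all(C[a - b] <= C[a]
               for a in C for b in C
               if b <= a and not C[a] <= b and proviso(C, a, b))

def w3(C):
    return qualified(
        C, lambda C, a, b: not C[b] <= C[a] or C[a - C[b]] - b <= C[a])

def w4(C):
    return qualified(C, lambda C, a, b: not C[b] <= C[a])

def w5(C):
    # b inside a - C(a) already forces b inside a and C(a) not inside b.
    return qualified(C, lambda C, a, b: b <= a - C[a])

V = {1, 2, 3, 4}
S = finite_subsets(V)

# Example (vi): x is choosable iff x <= y for every y != 1 in a.
ex6 = {a: frozenset(x for x in a if all(x <= y for y in a if y != 1))
       for a in S}

# Example (vii): 2 is rejected if 1 is available, 4 if 3 is available.
ex7 = {a: frozenset(x for x in a
                    if not (x == 2 and 1 in a) and not (x == 4 and 3 in a))
       for a in S}

print(w4(ex6), w3(ex6))
print(w5(ex7), w4(ex7))
```

In example (vi) the violation of W3 occurs at a = {1, 2, 3, 4} and b = {2, 3}; in example (vii) the same pair violates W4.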


4. The Equivalence Theorems 


I will now prove the equivalence of:

    BICH                               to W1,
    BICH and P-Transitivity            to W1 and W5,
    BICH and PIP-Transitivity          to W1 and W4,
    BICH and PIP + IPP-Transitivity    to W1 and W3, and
    BICH and P + I-Transitivity        to W1 and W2, hence to WARP.
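
Before turning to the proofs, the first and last of these equivalences can be spot-checked by brute force on a three-element V. The following Python is only a sanity check under assumptions, not a substitute for the proofs: it enumerates every function C satisfying Axioms II and III on V = {1, 2, 3}, and all names in it are hypothetical.

```python
from itertools import combinations, product

V = (1, 2, 3)

def nonempty_subsets(items):
    """All nonempty subsets of a finite collection, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

S = nonempty_subsets(V)

def all_choice_functions():
    """Every C with C(a) a nonempty subset of a, for each a in S."""
    options = [nonempty_subsets(a) for a in S]
    for picks in product(*options):
        yield dict(zip(S, picks))

def P(C, x, y):
    return x != y and C[frozenset({x, y})] == frozenset({x})

def I(C, x, y):
    return C[frozenset({x, y})] == frozenset({x, y})

def bich(C):
    return all(C[a] == frozenset(x for x in a
                                 if not any(P(C, y, x) for y in a))
               for a in S)

def w1(C):
    return all(C[a] & C[b] == C[a | b] & a & b for a in S for b in S)

def warp(C):
    return all(C[a - b] == C[a] - b
               for a in S for b in S if b <= a and not C[a] <= b)

def weak_order(C):
    """P + I-Transitivity: P is transitive and I is transitive."""
    return all((not (P(C, x, y) and P(C, y, z)) or P(C, x, z)) and
               (not (I(C, x, y) and I(C, y, z)) or I(C, x, z))
               for x in V for y in V for z in V)

functions = list(all_choice_functions())
theorem1_ok = all(bich(C) == w1(C) for C in functions)
theorem5_ok = all((bich(C) and weak_order(C)) == warp(C) for C in functions)
print(len(functions), theorem1_ok, theorem5_ok)
```

The enumeration is feasible only because V is tiny; the number of choice functions grows very quickly with the size of V.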





Theorem 1. BICH is equivalent to W1.

Proof. Assume BICH; let $\alpha, \beta \in S$; to prove that $C(\alpha) \cap C(\beta) =
C(\alpha \cup \beta) \cap \alpha \cap \beta$. By virtue of BICH, if $x \in C(\alpha) \cap C(\beta)$, then $x \in \alpha \cap \beta$,
yPx for no $y \in \alpha$, and yPx for no $y \in \beta$, whence $x \in \alpha \cup \beta$ and yPx for no
$y \in \alpha \cup \beta$, so that $x \in C(\alpha \cup \beta) \cap \alpha \cap \beta$. Conversely, again by virtue of
BICH, if $x \in C(\alpha \cup \beta) \cap \alpha \cap \beta$, then $x \in \alpha$, $x \in \beta$ and yPx for no
$y \in \alpha \cup \beta$, whence yPx for no $y \in \alpha$ and yPx for no $y \in \beta$, so that
$x \in C(\alpha) \cap C(\beta)$.

Now assume W1; let $x \in \alpha \in S$; to prove that $x \in C(\alpha)$ iff, for every $y \in \alpha$,
not yPx. If $x \in C(\alpha)$, then for every $y \in \alpha$, we have $\alpha = \alpha \cup \{x, y\}$, and so
$x \in C(\alpha \cup \{x, y\}) \cap \alpha \cap \{x, y\}$, whence $x \in C(\{x, y\})$ by W1, and thus not
yPx by definition of P. Conversely, suppose yPx for no $y \in \alpha$; to prove, by
induction on the cardinality of $\alpha$, that $x \in C(\alpha)$. Trivial if $\alpha = \{x\}$.
Otherwise, let $y \in \alpha - \{x\}$. Then

    $\alpha - \{y\}, \{x, y\} \in S$,

and not yPx, whence

    $x \in C(\{x, y\})$.

But

    $x \in C(\alpha - \{y\})$

by inductive hypothesis. So

    $x \in C(\{x, y\}) \cap C(\alpha - \{y\})$.

Hence, by W1, $x \in C((\alpha - \{y\}) \cup \{x, y\}) = C(\alpha)$. Q.E.D.

Lemma. Assume BICH and P-Transitivity, and therewith P-Acyclicity.
And suppose $\alpha \in S$ and $x \in \alpha - C(\alpha)$. Then yPx for some $y \in C(\alpha)$.

Proof. By virtue of BICH, since $x \in \alpha - C(\alpha)$, zPx for some $z \in \alpha$.
So, for some $n \geq 2$, there are $x_1, \ldots, x_n \in \alpha$ such that

    $x_n = x$ and $x_i P x_{i+1}$, $i = 1, 2, \ldots, n - 1$.    (*)

Were there no largest such $n$, there would be an $n$ larger than the
cardinality of $\alpha$ such that (*) held for some $x_1, \ldots, x_n \in \alpha$. But then, there
would exist $i, j = 1, 2, \ldots, n$ such that $i < j$ and yet $x_i = x_j$, so that

    $x_i P x_{i+1} P \cdots x_{j-1} P x_i$,

contrary to P-Acyclicity. Consequently, there is a largest $n$ ($n \geq 2$)
such that (*) holds for some $x_1, \ldots, x_n \in \alpha$, whence $zPx_1$ for no $z \in \alpha$,
and thus, by virtue of BICH, $x_1 \in C(\alpha)$; and by (*) and P-Transitivity
(applied $n - 2$ times), $x_1 P x$. Q.E.D.

Theorem 2. BICH and P-Transitivity are equivalent to W1 and W5.

Proof. Assume BICH and P-Transitivity. Then W1 holds by
Theorem 1. To deduce W5, suppose $\alpha, \beta \in S$, $\beta \subseteq \alpha - C(\alpha)$, and
$x \in C(\alpha - \beta)$; I will show that $x \in C(\alpha)$. By virtue of BICH, yPx for no
$y \in \alpha - \beta$, and so it suffices to show that yPx for no $y \in \beta$. By the Lemma,
since $\beta \subseteq \alpha - C(\alpha)$, if $y \in \beta$ and yPx, then zPy for some $z \in C(\alpha)$, whence
zPx by P-Transitivity. But since $\beta \subseteq \alpha - C(\alpha)$, $C(\alpha) \subseteq \alpha - \beta$, and thus
zPx for no $z \in C(\alpha)$. Hence, yPx for no $y \in \beta$.

Now assume W1 and W5. Then BICH holds by Theorem 1. To deduce
P-Transitivity, suppose xPyPz. Then by virtue of BICH, $y \notin C(\{x, y, z\})$
and $z \notin C(\{x, y, z\})$, whence $C(\{x, y, z\}) = \{x\}$. Since xPy, $x \neq y$, so that
$\{y\} \subseteq \{x, y, z\} - C(\{x, y, z\})$, and thus, by W5,

    $C(\{x, y, z\} - \{y\}) \subseteq C(\{x, y, z\}) = \{x\}$.

Since yPz, $y \neq z$. Therefore, since $x \neq y$, $\{x, y, z\} - \{y\} = \{x, z\}$. So

    $C(\{x, z\}) = \{x\}$.

But since xPyPz, $x \neq z$. Hence, by definition of P, xPz. Q.E.D.

Theorem 3. BICH and PIP-Transitivity are equivalent to W1 and W4.

Proof. Assume BICH and PIP-Transitivity, and therewith
P-Transitivity. Then W1 holds by Theorem 1. To deduce W4, suppose
$\alpha, \beta \in S$, $\beta \subseteq \alpha$, $C(\alpha) \not\subseteq \beta$, $C(\beta) \not\subseteq C(\alpha)$, and $x \in C(\alpha - \beta)$; I will show that
$x \in C(\alpha)$. By virtue of BICH, since $x \in C(\alpha - \beta)$,

    yPx for no $y \in \alpha - \beta$,    (*)

and it suffices to show that yPx for no $y \in \beta$. Suppose, on the contrary,
that $y \in \beta$ and yPx; I will deduce a contradiction. Either $y \in C(\beta)$ or
$y \in \beta - C(\beta)$. Even in the latter case, the Lemma implies that there is a
$z \in C(\beta)$ such that zPy, whence zPx by P-Transitivity. So in either case,
there is a $z$ such that

    $z \in C(\beta)$ and zPx.

But since $C(\beta) \not\subseteq C(\alpha)$, there is a $w$ such that

    $w \in C(\beta)$ but $w \notin C(\alpha)$.

By BICH, since $z, w \in C(\beta)$,

    wIz.

And by the Lemma, since $w \notin C(\alpha)$, there is a $v \in C(\alpha)$ such that

    vPw.

By BICH, since $w \in C(\beta)$, $v \notin \beta$, whence $v \in \alpha - \beta$, and so, by (*), not vPx.
But by PIP-Transitivity, since vPwIzPx, we have vPx, hence a
contradiction.

Now assume W1 and W4. Then BICH holds by Theorem 1, and, since
W4 implies W5, P-Transitivity holds by Theorem 2. To deduce
PIP-Transitivity, suppose xPyIzPw. Then by P-Transitivity, if zPx, zPy, which
is impossible because yIz. Thus, not zPx. But if wPx, then zPx by
P-Transitivity. So not wPx. And since xPy, not yPx. So vPx for no
$v \in \{x, y, z, w\}$, whence $x \in C(\{x, y, z, w\})$ by BICH. But since xPyIz,
$x \notin \{y, z\}$. Hence,

    $C(\{x, y, z, w\}) \not\subseteq \{y, z\}$.

By BICH, since xPy, $y \notin C(\{x, y, z, w\})$. But since yIz, $C(\{y, z\}) = \{y, z\}$.
Hence,

    $C(\{y, z\}) \not\subseteq C(\{x, y, z, w\})$.

Therefore, by W4,

    $C(\{x, y, z, w\} - \{y, z\}) \subseteq C(\{x, y, z, w\})$.

Because yIzPw, $w \notin \{y, z\}$. But $x \notin \{y, z\}$. So $\{x, y, z, w\} - \{y, z\} = \{x, w\}$.
Consequently,

    $C(\{x, w\}) \subseteq C(\{x, y, z, w\})$.

By virtue of BICH, since zPw, $w \notin C(\{x, y, z, w\})$, whence $w \notin C(\{x, w\})$,
and so

    $C(\{x, w\}) = \{x\}$.

And since zPw but not zPx, $w \neq x$. Hence, by definition of P, xPw.

Q.E.D.





Theorem 4. BICH and PIP + IPP-Transitivity are equivalent to W1
and W3.

Proof. Assume BICH and PIP + IPP-Transitivity, and therewith
PIP-Transitivity and P-Transitivity. Then W1 holds by Theorem 1, and
W4 holds by Theorem 3. To deduce W3, suppose $\alpha, \beta \in S$, $\beta \subseteq \alpha$, $C(\alpha) \not\subseteq \beta$,
and either $C(\beta) \not\subseteq C(\alpha)$ or $C(\alpha - C(\beta)) - \beta \subseteq C(\alpha)$. Then if $C(\beta) \not\subseteq C(\alpha)$,
$C(\alpha - \beta) \subseteq C(\alpha)$ by W4. Suppose, on the other hand, that
$C(\alpha - C(\beta)) - \beta \subseteq C(\alpha)$. Then to show that $C(\alpha - \beta) \subseteq C(\alpha)$, it suffices to show that
$C(\alpha - \beta) \subseteq C(\alpha - C(\beta))$. Let $x \in C(\alpha - \beta)$; to prove that $x \in C(\alpha - C(\beta))$.
By virtue of BICH,

    yPx for no $y \in \alpha - \beta$,    (*)

and it suffices to show that yPx for no $y \in \beta - C(\beta)$. Let $y \in \beta - C(\beta)$;
to prove that not yPx. By the Lemma, there is a $z \in C(\beta)$ such that

    zPy.

Since $C(\alpha) \not\subseteq \beta$, there is a $w$ such that

    $w \in C(\alpha) - \beta$.

Then by virtue of BICH, not zPw. So either wPz or wIz. Assuming wPz,
we have wPy by P-Transitivity, and thus if yPx then wPx. On the other
hand, assuming wIz, if yPx then wIzPyPx, whence wPx by virtue of
PIP + IPP-Transitivity. Therefore, in either case, if yPx then wPx. But
since $w \in C(\alpha) - \beta$, not wPx by (*). Hence, not yPx.

Now assume W1 and W3. Then BICH holds by Theorem 1, and, since
W3 implies W4, PIP-Transitivity holds by Theorem 3; hence,
P-Transitivity holds; and it suffices to prove that, for all x, y, z, w, if xIyPzPw
then xPw. Suppose xIyPzPw. Then if $x = y$, we have xPzPw, whence
xPw by P-Transitivity, and the theorem is proved. Suppose, on the other
hand, that $x \neq y$. Then it suffices to show that $w \notin C(\{x, w\})$. By BICH,
since zPw, $w \notin C(\{x, y, z, w\})$. So it suffices to show that $C(\{x, w\}) \subseteq
C(\{x, y, z, w\})$. Since yIx but yPz, $x \neq z$. And since yPzPw, we have
$w \neq z$, as well as yPw by P-Transitivity, whence $w \neq y$. So neither x nor w
belongs to $\{y, z\}$. Consequently, $\{x, w\} = \{x, y, z, w\} - \{y, z\}$, and it
suffices to show that $C(\{x, y, z, w\} - \{y, z\}) \subseteq C(\{x, y, z, w\})$. Therefore,
by W3, it suffices to show that

    $C(\{x, y, z, w\}) \not\subseteq \{y, z\}$    (a)

and

    $C(\{x, y, z, w\} - C(\{y, z\})) - \{y, z\} \subseteq C(\{x, y, z, w\})$.    (b)

Since xIy, not yPx. If zPx then yPzPx, whence, by P-Transitivity, yPx,
which is absurd. So not zPx. If wPx, then zPwPx, whence, by
P-Transitivity, zPx, which is absurd. So not wPx. Hence, nothing in
$\{x, y, z, w\}$ bears P to x, and thus, by BICH,

    $x \in C(\{x, y, z, w\})$.    (**)

But $x \neq y$ by assumption. And $x \neq z$ since yPz. Therefore, $x \notin \{y, z\}$.
Consequently, by (**), (a) holds, and it suffices to establish (b). Since yPz,
$C(\{y, z\}) = \{y\}$. So it suffices to show that

    $C(\{x, y, z, w\} - \{y\}) - \{y, z\} \subseteq C(\{x, y, z, w\})$.    (b')

Since yPz, $y \neq z$, whence $z \in \{x, y, z, w\} - \{y\}$. But zPw. So, by BICH,
$w \notin C(\{x, y, z, w\} - \{y\}) - \{y, z\}$. But neither y nor z belongs to this set.
So at most x belongs, and thus, by (**), (b') holds. Q.E.D.

Theorem 5. BICH and P + I-Transitivity are equivalent to W1 and W2,
hence to WARP.

Proof. Assume BICH and P + I-Transitivity. Then W1 holds by
Theorem 1. To deduce W2, suppose $\alpha, \beta \in S$, $\beta \subseteq \alpha$, $C(\alpha) \not\subseteq \beta$, and
$x \in C(\alpha - \beta)$; I will show that $x \in C(\alpha)$. By the Lemma, it suffices to show
that nothing in $C(\alpha)$ bears P to x. But, by virtue of BICH, since
$x \in C(\alpha - \beta)$, nothing in $\alpha - \beta$ bears P to x. Hence, it suffices to show
that nothing in $C(\alpha) \cap \beta$ bears P to x. Let $y \in C(\alpha) \cap \beta$; to prove that not
yPx. Since $C(\alpha) \not\subseteq \beta$, there exists a $z \in C(\alpha) - \beta$. But by BICH, since
$y \in C(\alpha)$, yIz, and since nothing in $\alpha - \beta$ bears P to x, zIx, whence yIx by
P + I-Transitivity, and thus not yPx.

Now assume W1 and W2. Then BICH holds by Theorem 1. And since
W2 implies W5, P-Transitivity holds by Theorem 2. So it suffices to show
that I is transitive. Suppose, on the contrary, that I were not transitive;
then there would exist x, y, z such that xIyIzPx, whence I will deduce a
contradiction. Since zPx, $z \neq x$, and since zPx but yIx, $z \neq y$. So z is
neither x nor y. But $C(\{x, y, z\}) = \{y, z\}$ by virtue of BICH. So
$C(\{x, y, z\}) \not\subseteq \{z\}$, and thus, by W2,

    $C(\{x, y, z\} - \{z\}) \subseteq C(\{x, y, z\})$.

By BICH, since $x \neq z$, $x \in C(\{x, y, z\} - \{z\})$. Hence, $x \in C(\{x, y, z\})$. But
that is impossible, by virtue of BICH, because zPx.





References 

1. K. J. Arrow, Rational choice functions and orderings, Economica 26 (1959).

2. P. C. Fishburn, Intransitive indifference with unequal indifference intervals,
J. Math. Psychology 7 (1970).

3. H. S. Houthakker, Revealed preference and the utility function, Economica 17
(1950).

4. D. T. Jamison and L. J. Lau, Semiorders and the theory of choice, Econometrica 41
(1973).

5. R. D. Luce, Semiorders and a theory of utility discrimination, Econometrica 24
(1956). 

6. K. O. May, Intransitivity, utility, and the aggregation of preference patterns, 
Econometrica 22 (1954). 

7. C. R. Plott, Path independence, rationality, and social choice, Econometrica 41 
(1973). 

8. P. A. Samuelson, A note on the pure theory of consumer’s behavior, Econometrica 
5 (1938). 

9. T. Schwartz, On the possibility of rational policy evaluation, Theory and Decision 
1 (1970). 

10. T. Schwartz, Rationality and the myth of the maximum, Noûs 6 (1972).

11. A. K. Sen, “Collective Choice and Social Welfare,” Holden-Day, San Francisco, 
Calif., 1970. 

12. A. K. Sen, Choice functions and revealed preference, Rev. Econ. Studies 38 (1971). 

13. A. Tversky, Intransitivity of preferences, Psychological Rev. 76 (1969). 



JOURNAL OF ECONOMIC THEORY 13, 428-438 (1976) 


Increasing Returns to Scale and Productive Systems 

E. Dierker 

Universität Bonn, DFG-Sonderforschungsbereich 21, 53 Bonn, Germany

C. Fourgeaud

Université de Paris I, Sciences Économiques, Paris Ve, France
AND

W. Neuefeind

Universität Bonn, DFG-Sonderforschungsbereich 21, 53 Bonn, Germany
Received November 8, 1974; revised April 22, 1976 


We consider an economy in which some producers may have nonconvex 
technologies, in particular they may operate under increasing returns to scale. 
These producers constitute the I-part of the economy. The prices of all goods 
and the target supply of the I-part are supposed to be given. Each producer in 
the I-part produces a certain output in a cost-minimizing way. There is a planning
board which has to find decentralizing (gross) output levels for the producers
in the I-part such that the net output equals target supply. The existence of
such output levels is proven under a productivity assumption which includes 
the case of strongly increasing returns to scale as well as productive Leontief 
systems. 


1. Introduction 

Imagine an economy where some producers have a technology with 
increasing returns to scale. As is well known, the production decisions 
in this economy cannot be decentralized in the way that prices are set and 
producers react by maximizing profits. Consequently, one has to look for 
a different way to coordinate production decisions. 

To fix ideas, consider an example. Think of some types of, say, 
standardized, prefabricated houses and of some types of trucks, which 
can be produced with considerable economies of scale. On the one hand, 
the government is interested in the production of these goods, e.g., to 
make them available as inputs for certain industries or to satisfy the basic 
needs of large parts of the country's population. On the other hand, it 


Copyright © 1976 by Academic Press, Inc.

All rights of reproduction in any form reserved.





seems desirable that production decisions are fairly decentralized. The 
government plans how many units of each standardized type of house
and of truck shall be made available by the construction and automobile
industries, respectively. These quantities are sold to the consumers and to
the other industries on the market.

As is familiar in the theory of Leontief systems we ask the following 
question. Given a fixed net production target of houses and of trucks, 
which activities in both industries will make the target supply actually 
available to the other economic agents? 

Interested in decentralization, the government lets producers decide 
upon which inputs are used in each industry. These industries, however, 
are interdependent; a thriving housing industry, for instance, requires 
quite a few trucks to be supplied, in addition to other demand. 

We assume that the prices for all goods, houses, trucks, labor, etc., 
are given. In the context of Leontief systems this assumption is 
unnecessary, because in that particular case substitution plays no role. 
In this article we deal with the question of how to arrange the production 
of a fixed net supply at fixed prices. 

In the sequel we shall distinguish between two groups of commodities:
those which can be produced with increasing returns to scale and those
which cannot. A commodity in the first group is an “I-good,” a commodity
in the second group is a “C-good.” (The letter I recalls “increasing,”
the letter C recalls “convex.”) All producers of I-goods, the “I-producers,”
are placed under the supervision of a planning board. They constitute
the “I-part” of the economy. The target supply, which is set by the
government, i.e., given exogenously, is a list that specifies the amount
of every I-good to be made available by the I-part to agents outside the
I-part. The quantities of C-goods used for the I-part’s production result
from decisions within the I-part.

Now we formulate more precisely the task of the planning board in the 
present model. Assume that the target supply (of the I-goods) and the 
prices of all commodities are fixed. The planning board must provide 
the productive units in the I-part with decision parameters such that the 
induced reactions (production decisions) match the target, i.e., the
aggregate net supply of I-goods equals the target supply.

For that purpose the planning board announces the gross output of 
I-goods which every I-producer shall produce. Each I-producer then 
chooses an input combination which minimizes costs at the prevailing 
price system and which yields the prescribed gross output. Thus there are 
two types of decision variables: the prices, which are common for all 
I-producers and the gross output levels, which affect each I-producer 
individually. 





Organizing production in this way need not lead to an efficient 
production schedule within the I-part. The set of production schedules 
yielding the target supply need not be compact. Hence, for any production 
schedule, there may be another one whose total costs of producing the 
target are lower. For the discussion of Pareto-efficiency in nonconvex 
economies, see, e.g., [3, 4].

We shall prove the following result under a strong increasing returns 
to scale assumption: For each price system and for each target supply, 
there exist gross output levels such that net output of I-goods equals
target supply, if the production of I-goods is organized as explained 
above. 

The assumption of strongly increasing returns to scale is unduly 
restrictive. What we actually use is some kind of productivity assumption 
on the I-part. The productivity of the I-part depends on the asymptotic 
behavior of its supply. In case of a classical Leontief system our
productivity assumption coincides with the usual one. It is, however, more
general: It does not require constant returns to scale and it allows for
substitution. For instance, it is satisfied if every I-producer operates
with strongly increasing returns to scale. Furthermore, it generalizes the
productivity assumption used by Sandberg [5] in a nonlinear input-output 
model. 

The paper is organized as follows. The next section contains the 
assumptions about the I-part in the case of strongly increasing returns 
to scale. Each assumption is followed by a short discussion. An existence 
theorem is also stated in Section 2. Section 3 introduces the more general 
productivity assumption into the model. The proofs are in Section 4. 


2. Increasing Returns to Scale 

There is a finite number of different commodities. The set of commodities 
is partitioned into two subsets, the set of “I-goods” and the set of 
“C-goods.” I-goods can be produced with increasing returns to scale, 
whereas C-goods cannot. Producers of at least one I-good are called 
I-producers. The other producers are called C-producers. Let n be the 
number of I-producers. 

Let $\ell$ be the number of I-goods and $k$ the number of C-goods. The
commodity space is $\mathbb{R}^{\ell+k}$. The fixed target supply to be provided by the
I-part is described by a nonnegative vector $b \in \mathbb{R}^{\ell}_{+}$.

We shall frequently use the following notation: for a commodity
bundle $z \in \mathbb{R}^{\ell+k}$ we denote by $z|_I \in \mathbb{R}^{\ell}$ ($z|_C \in \mathbb{R}^{k}$) the projection of $z$ onto
the $\ell$-dimensional ($k$-dimensional) space corresponding to the I-goods
(C-goods). The same notation will be used for sets. Further, for a vector
$z = (\zeta_1, \ldots, \zeta_{\ell+k}) \in \mathbb{R}^{\ell+k}$ the vector $z_+ \in \mathbb{R}^{\ell+k}$ denotes the positive
part of $z$; i.e., the $j$th component of $z_+$ equals $\max\{\zeta_j, 0\}$. The negative
part $z_-$ of $z$ is defined in a similar way. If $z$ is a production program, $z_+$
describes the output part, $z_-$ the input part of $z$.
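
The notation can be illustrated with a short Python sketch. It rests on two assumptions that the text does not fix: the negative part is taken componentwise as min{ζ_j, 0}, so that a plan is the sum of its positive and negative parts, and the I-goods are listed before the C-goods in every vector. The function names and the sample plan are hypothetical.

```python
def positive_part(z):
    """Componentwise max{zeta_j, 0}: the output part of a plan."""
    return [max(c, 0) for c in z]

def negative_part(z):
    """Assumed convention: componentwise min{zeta_j, 0}, the input part."""
    return [min(c, 0) for c in z]

def project(z, ell):
    """Split z into its I-good and C-good coordinates.

    Assumes the first ell coordinates are the I-goods."""
    return z[:ell], z[ell:]

# Hypothetical production plan with ell = 2 I-goods and k = 2 C-goods.
z = [3, 0, -2, -5]
z_plus = positive_part(z)
z_minus = negative_part(z)
z_I, z_C = project(z, 2)
print(z_plus, z_minus, z_I, z_C)
```

Under the assumed sign convention the two parts add back to the original plan, which is a convenient consistency check.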

We assume: 

(A1) Each I-good is produced by exactly one producer.

By this assumption we do not mean that each I-good is produced by
only one technological unit. The producer of an I-good can be a whole
branch of industry rather than an individual factory. Electricity, for
instance, is produced in several power plants which we shall regard as a
single, big producer for the purposes of this paper. Assumption (A1)
is a conceptual simplification rather than a technical necessity to prove
the existence of decentralizing gross output levels.

Whereas we do not distinguish between individual factories of one 
producer, we shall distinguish between different products of a producer. 
Since $n \le \ell$, the set of I-goods can be partitioned into the $n$ subsets 
corresponding to the $n$ I-producers. Let $\ell_h$ be the number of I-goods 
provided by producer $h$. Then $\sum_{h=1}^{n} \ell_h = \ell$. A vector $a \in \mathbb{R}^{\ell}_{+}$ of activity 
levels (i.e., gross output levels) for the I-part is accordingly partitioned as 
follows: 

$a = (a_1, \dots, a_h, \dots, a_n)$, 

where $a_h$ contains the activity levels of I-producer $h$. For an I-producer $h$, 
let $Z_h \subset \mathbb{R}^{\ell+k}$ denote the set of technologically feasible production plans 
that correspond to activity levels $a_h \ge 0$. This definition differs slightly 
from the usual definition of a production set as it also takes into account 
that $a_h \ge 0$. For given activity levels $a \in \mathbb{R}^{\ell}_{+}$, we define $Z_h(a) \subset \mathbb{R}^{\ell+k}$ 
as the subset of production plans in $h$'s production set $Z_h$ which yield 
the output vector $a_h$. Formally, 

$Z_h(a) := Z_h \cap \mathrm{pr}_h^{-1}(a_h)$, 

where $\mathrm{pr}_h: \mathbb{R}^{\ell+k} \to \mathbb{R}^{\ell_h}$ is the projection onto the space of activities of 
producer $h$. The set $Z_h(a)$ depends on $a_h$ only; we do not treat external 
effects. The production set $Z_h$ is the image of the correspondence 

$Z_h: \mathbb{R}^{\ell}_{+} \to \mathbb{R}^{\ell+k}$ 

which assigns the set $Z_h(a)$ to $a \in \mathbb{R}^{\ell}_{+}$. 

(A2) For $h = 1, \dots, n$ and for each $a \in \mathbb{R}^{\ell}_{+}$, the set $Z_h(a)$ is nonempty 
and convex. 





This assumption says that each nonnegative vector of activity levels is 
feasible as far as producer $h$ is concerned; i.e., he can supply any $a_h \in \mathbb{R}^{\ell_h}_{+}$ 
if enough inputs are available. Furthermore, the production plans that 
yield $a_h$ form a convex set. 

The following assumption is essentially a conceptual simplification. 

(A3) No I-producer $h$ provides C-goods, i.e., $Z_h(a)|_C \subset \mathbb{R}^{k}_{-}$ for all 
$a \in \mathbb{R}^{\ell}_{+}$. 

This assumption is unnecessarily strong for our purposes. We make it 
in order to avoid clumsy notation. The assumption we actually need is 
that the amounts of C-goods produced by h remain small if a h gets large. 

The following two assumptions are of mathematical character: 

(A4) For $h = 1, \dots, n$ the correspondence $Z_h: \mathbb{R}^{\ell}_{+} \to \mathbb{R}^{\ell+k}$ has a closed 
graph. 

(A5) For $h = 1, \dots, n$ the correspondence $Z_h: \mathbb{R}^{\ell}_{+} \to \mathbb{R}^{\ell+k}$ is lower 
hemicontinuous (l.h.c.). 

Assumption (A4) amounts to the standard assumption that the production 
sets $Z_h$ are closed in $\mathbb{R}^{\ell+k}$. (A5) excludes the saw-tooth phenomenon of 
Fig. 1, but it allows the situation in Fig. 2. In Fig. 1 a small output 
reduction at a saw-tooth is possible only if many more inputs are used, 
a highly unrealistic situation. Note that part of the lower hemicontinuity 
of the correspondence $Z_h$ is embodied in the following assumption. 




(A6) For $h = 1, \dots, n$ there exists a number $\gamma_h < 1$ such that $z \in Z_h(a)$ 
implies $(\lambda \cdot z_+ + \lambda^{\gamma_h} \cdot z_-) \in Z_h(\lambda \cdot a)$ for all $\lambda \ge 1$. 

This assumption makes precise what we understand by strongly increasing 
returns to scale. As already pointed out, it is more restrictive than we need 
it to be for our purposes. We state it, however, because a set of firms all 
operating with strongly increasing returns to scale presents an important 
example of a productive system and was our motivation to study such 
systems. 
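
A small worked example may help to see what (A6) demands. The technology below is purely illustrative and not taken from the paper: one I-good (first coordinate), one C-good used as input (second coordinate), and hypothetical constants $c > 0$ and $0 < \gamma < 1$.

```latex
% Illustrative technology (an assumption, not the authors' example):
% producing a units of the I-good needs at least c a^gamma units of input.
\[
  Z(a) = \{ (a, -x) : x \ge c\, a^{\gamma} \}, \qquad c > 0,\ 0 < \gamma < 1 .
\]
% For z = (a, -x) in Z(a) we have z_+ = (a, 0) and z_- = (0, -x), so
\[
  \lambda z_+ + \lambda^{\gamma} z_- = (\lambda a,\ -\lambda^{\gamma} x),
  \qquad
  \lambda^{\gamma} x \;\ge\; \lambda^{\gamma} c\, a^{\gamma} = c\,(\lambda a)^{\gamma},
\]
% hence the scaled plan lies in Z(lambda a) for every lambda >= 1,
% which is (A6) with gamma_h = gamma.
```

In this sketch output scales by $\lambda$ while input requirements scale only by $\lambda^{\gamma}$, so the input needed per unit of output falls like $\lambda^{\gamma-1}$.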

We consider an exogenously given strictly positive price system $p$ and 
the following kind of equilibrium within the I-part. 

Definition. For fixed prices $p \gg 0$ and fixed target supply $b \in \mathbb{R}^{\ell}_{+}$ 
the tuple $(a, z_1, \dots, z_n) \in \mathbb{R}^{\ell}_{+} \times (\mathbb{R}^{\ell+k})^n$ is an I-equilibrium for $(p, b)$ if 

(i) for each $h = 1, \dots, n$ the production plan $z_h$ is a cost-minimizing 
element of $Z_h(a)$; i.e., $z_h \in Z_h(a)$ and $p z_h = \max p Z_h(a)$, 

(ii) the I-part's (net) supply of I-goods equals the target supply $b$: 

$\sum_{h=1}^{n} z_h|_I = b$. 

Definition. Given $p \gg 0$ and $b \in \mathbb{R}^{\ell}_{+}$, an activity vector $a \in \mathbb{R}^{\ell}_{+}$ is 
decentralizing if there exist vectors $z_1, \dots, z_n \in \mathbb{R}^{\ell+k}$ such that $(a, z_1, \dots, z_n)$ 
is an I-equilibrium for $(p, b)$. 

Theorem 1. Assume (A1) to (A6). Then for any price system $p \gg 0$ and 
any target supply $b \in \mathbb{R}^{\ell}_{+}$, there exists a decentralizing activity vector 
$a \in \mathbb{R}^{\ell}_{+}$. 


3. Asymptotic Productivity 

Our proof of Theorem 1 is based on the following observation. 
Assumption (A6) of strongly increasing returns to scale implies that the 
I-part is productive in the following sense. Let $a \in \mathbb{R}^{\ell}_{+}$, $a \ne 0$, be any 
vector of activity levels. By increasing the scale of operation to $\lambda a$, 
$\lambda$ large, one can achieve that the net output exceeds the target supply $b$ 
for at least one I-good. A fixed point argument then shows that the net 
output is exactly equal to $b$, if the I-part operates at appropriate intensities. 

A set of firms, however, can be productive in the above sense without 
exhibiting increasing returns to scale. Therefore, we shall replace the 
concept of an I-part as described in Section 2 by a more general concept. 
As before we distinguish $\ell$ goods, which are called I-goods. Instead of (A6) 
the production of I-goods has to fulfil a productivity assumption, (A7), 
which we are going to define. The terminology is as before: I-goods are 
produced by I-producers, etc. 





For a better understanding of the productivity assumption, let us briefly 
look at the classical Leontief system (see Fig. 3). In that system producer $h$ 
($h = 1, \dots, n = \ell$) produces just one good, $h$. If we neglect commodities 
not produced within the system (e.g., human labor) then $h$'s production 
set is simply a ray: 

$Z_h := \{\lambda \cdot c_h \mid \lambda \ge 0\}$, where $c_h \in \mathbb{R}^{\ell}$. 

The system is productive iff 

$\mathbb{R}^{\ell}_{+} \subset \sum_{h=1}^{n} Z_h$. 

This condition amounts to the following: 

(i) $\bigl(\sum_{h=1}^{n} Z_h\bigr) \cap \mathbb{R}^{\ell}_{-} = \{0\}$, and 

(ii) the cones $Z_1, \dots, Z_\ell$ are positively semi-independent. (This 
property is defined in [1, p. 22].) 
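
For two goods the productivity condition can be checked by hand. The sketch below is only an illustration under stated assumptions: the function names and the numerical vectors are hypothetical, and it uses the elementary fact that, when the two rays are not collinear, the positive orthant lies in the cone they span exactly when the inverse of the matrix with columns $c_1, c_2$ has no negative entry.

```python
def leontief_is_productive(c1, c2, tol=1e-12):
    """Check whether R^2_+ is contained in the cone spanned by rays c1 and c2.

    c1, c2 are net output vectors per unit of activity (hypothetical data).
    """
    det = c1[0] * c2[1] - c2[0] * c1[1]
    if abs(det) < tol:
        # Collinear rays span at most a line, which cannot contain the orthant.
        return False
    # Inverse of the 2x2 matrix whose columns are c1 and c2.
    inverse = [[c2[1] / det, -c2[0] / det],
               [-c1[1] / det, c1[0] / det]]
    return all(entry >= -tol for row in inverse for entry in row)


def activity_levels(c1, c2, b):
    """Solve lam1 * c1 + lam2 * c2 = b by Cramer's rule (assumes det != 0)."""
    det = c1[0] * c2[1] - c2[0] * c1[1]
    lam1 = (c2[1] * b[0] - c2[0] * b[1]) / det
    lam2 = (c1[0] * b[1] - c1[1] * b[0]) / det
    return lam1, lam2


# Each producer makes one unit of its own good and uses some of the other.
productive = leontief_is_productive((1.0, -0.3), (-0.4, 1.0))
# Here producer 1 needs two units of good 2 per unit of good 1: too much.
unproductive = leontief_is_productive((1.0, -2.0), (-1.0, 1.0))
```

In the first system every nonnegative target supply can be met with nonnegative activity levels, which `activity_levels` returns; in the second no such levels exist for some targets.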



Productivity in nonlinear models depends on the asymptotic shapes of 
the technologies or on the asymptotic behavior of the supply of the 
producers. Hence it has to be formulated in these terms. For that purpose 
we first define the supply in our framework and state some useful 
properties. Note that prices are fixed in the model presented here. 






Definition. The individual supply of I-producer $h$ at price system 
$p \gg 0$, $\xi_h(p, \cdot): \mathbb{R}^{\ell}_{+} \to \mathbb{R}^{\ell+k}$, is defined by 

$a \mapsto \xi_h(p, a) := \{z \mid z \in Z_h(a),\ pz \ge pZ_h(a)\}$. 

The aggregate (net) supply of the I-part at $p$ is 

$\xi(p, \cdot) := \sum_{h=1}^{n} \xi_h(p, \cdot)$. 

Note that we do not use (A6) in the following lemma. 

Lemma. Assume (A1) to (A5). Then, for all $a \in \mathbb{R}^{\ell}_{+}$, the set $\xi_h(p, a)$ is 
nonempty, compact, and convex. Furthermore, $\xi_h(p, \cdot)$ is an upper hemi- 
continuous (u.h.c.) correspondence. Aggregate supply $\xi(p, \cdot)$ has the same 
properties. 

For the sequel, assume (A1) to (A5). For $h = 1, \dots, n$ and $p \gg 0$ define 
$A_h(p)$, the asymptotic cone for $\xi_h(p, \cdot)|_I$: 

$A_h(p) := \bigcap \operatorname{cl\,cone} \bigl[ \bigcup \xi_h(p, a)|_I \bigr]$, 

where the operation "cl cone" means "closed cone with vertex 0." The 
asymptotic cones lie in $\mathbb{R}^{\ell}$ since the productivity of the I-part refers to 
I-goods only. Then the productivity assumption reads as follows: 

(A7) The cones $A_1(p), \dots, A_n(p)$ satisfy 

(i) $\bigl(\sum_{h=1}^{n} A_h(p)\bigr) \cap \mathbb{R}^{\ell}_{-} = \{0\}$, and 

(ii) $A_1(p), \dots, A_n(p)$ are positively semi-independent. 

The next proposition shows that this productivity assumption generalizes 
the assumption of strongly increasing returns to scale. Furthermore, 
one can show that Sandberg’s [5] productivity conditions are covered 
by ours. 

Proposition. Assume (A1) to (A5). If (A6) holds then (A7) holds for each 
price system $p \gg 0$. 

Moreover, we shall show that $A_h(p)$ equals the positive orthant in the 
$\ell_h$-dimensional subspace of $\mathbb{R}^{\ell}$ which describes the output of producer $h$. 
That means that $A_h(p)$ is independent of $p \gg 0$ if (A6) holds. This brings 
the case of strongly increasing returns to scale close to the Leontief case. 




Theorem 2. Assume (A1) to (A5), and (A7). Then, for any price system 
$p \gg 0$ and for any target supply $b \in \mathbb{R}^{\ell}_{+}$, there exists a decentralizing 
activity vector $a \in \mathbb{R}^{\ell}_{+}$. 

Note that the assumptions of Theorem 2 do not exclude constant or 
even decreasing returns to scale for the individual I-producers. The 
important property is the productivity of the I-part as a whole. According 
to the proposition, Theorem 2 implies Theorem 1. 


4. Proofs 

Proof of the Lemma. The set $\xi_h(p, a)$ is bounded above because of (A3). 
By (A2) and (A4), $Z_h(a)$ is nonempty, closed, and convex. Hence, because 
of $p \gg 0$, the set $\xi_h(p, a)$ has the same properties. Moreover, $\xi_h(p, a)$ is 
bounded. The proof of the upper hemicontinuity is a routine matter that 
we skip. 

All these properties are preserved under addition. Q.E.D. 

Proof of the Proposition. Let $p \gg 0$ be given. We shall show that the 
costs-sales ratio in the I-part approaches zero, if the scale tends to infinity. 

Define 

$C_h(p) := \{a \in \mathbb{R}^{\ell}_{+} \mid \text{value of } a_h \text{ at prices } p \text{ equals } 1\}$. 

Since $\xi_h(p, \cdot)$ is u.h.c. and compact-valued there exists 

$v_h(p) := \min_{a \in C_h(p)} p(\xi_h(p, a))_- > -\infty$. 

For $a \in C_h(p)$, the costs of producer $h$ caused by the production of $\lambda \cdot a$ 
are $p(\xi_h(p, \lambda \cdot a))_-$. The sales of producer $h$ are $\lambda$. Because returns to 
scale are strongly increasing, we have for every $a \in C_h(p)$ and for every 
$\lambda \ge 1$: 

$(1/\lambda) \cdot p(\xi_h(p, \lambda \cdot a))_- \ge (1/\lambda) \cdot \lambda^{\gamma_h} \cdot v_h(p) = \lambda^{(\gamma_h - 1)} \cdot v_h(p)$. 

Therefore, the costs-sales ratio is bounded below uniformly in $a$ by a 
bound which tends to zero if the scale tends to infinity. Since $h$'s costs 
caused by the use of I-goods only cannot exceed $h$'s total costs and since 
$p \gg 0$, no point in $A_h(p)$ has a negative component. Therefore, $A_h(p)$ 
equals the positive orthant of the $\ell_h$-dimensional subspace of $\mathbb{R}^{\ell}$ 
corresponding to the outputs of producer $h$. Q.E.D. 
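
The decay of the costs-sales bound in this proof is easy to see numerically. The following sketch is an illustration only: `cost_sales_bound` is a hypothetical helper, `unit_cost` stands for the magnitude of the cost bound at unit sales, and the numbers are made up.

```python
def cost_sales_bound(scale, gamma, unit_cost):
    """Magnitude of the bound scale**(gamma - 1) * unit_cost on costs per unit of sales.

    scale     -- the factor lambda >= 1 by which activity levels are multiplied
    gamma     -- the exponent gamma_h < 1 from (A6)
    unit_cost -- assumed magnitude of the cost bound at sales equal to 1
    """
    if scale < 1:
        raise ValueError("(A6) is only stated for scale factors of at least 1")
    if gamma >= 1:
        raise ValueError("strongly increasing returns require gamma < 1")
    return scale ** (gamma - 1.0) * unit_cost


# With gamma = 0.5 the bound shrinks like 1 / sqrt(scale).
bounds = [cost_sales_bound(s, 0.5, 4.0) for s in (1, 4, 16, 64, 256)]
```

The list `bounds` is strictly decreasing and approaches zero as the scale grows, which is the behaviour the proof relies on.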

Proof of Theorem 2. Let $p \gg 0$ be given. 

(a) First we show that for large activity levels $a$ the aggregate net 
supply $\xi(p, a)$ of the I-part must be large for at least one I-good. 

To see this, consider a sequence $(a^q)_{q \in \mathbb{N}}$ of activity levels with their 
norm tending to infinity. The corresponding sequence of normalized 
activity levels $(a^q / \|a^q\|)_{q \in \mathbb{N}}$ has a converging subsequence with indices in 
$N_1 \subset \mathbb{N}$. For $q \in N_1$, choose $x_h^q \in \xi_h(p, a^q)|_I$ and let $x^q := \sum_{h=1}^{n} x_h^q \in 
\xi(p, a^q)|_I$. For each $h$, part (i) of the productivity assumption (A7) in 
particular excludes "asymptotic disposal activities" in $A_h(p)$, i.e., 
$A_h(p) \cap \mathbb{R}^{\ell}_{-} = \{0\}$. From that we deduce that the sequence $((1/\|a^q\|)\, x_h^q)_{q \in N_1}$ 
is bounded below. Obviously, it is bounded above. We select a subsequence 
$N_2$ of $N_1$ such that $s_h := \lim_{q \in N_2} (1/\|a^q\|)\, x_h^q$ exists for all $h = 1, \dots, n$. 

If $s_h \ne 0$, then $(\|a_h^q\|)_{q \in N_2}$ cannot be bounded. Otherwise $(x_h^q)_{q \in N_2}$ 
would be bounded, which contradicts $s_h \ne 0$. Thus $s_h \in A_h(p)$ for all 
$h = 1, \dots, n$. 

We want to show that $(x^q)_{q \in N_2}$ is not bounded above. Assume the 
contrary. Then $\sum_{h=1}^{n} s_h = \lim_{q \in N_2} (1/\|a^q\|)\, x^q \le 0$ and part (i) of the 
productivity assumption (A7) yields $\sum_{h=1}^{n} s_h = 0$. Hence, from part (ii) 
we conclude that $s_h = 0$ for each $h = 1, \dots, n$. On the other hand, 
$\lim_{q \in N_2} (a^q / \|a^q\|)$ has a positive component, say the first. This is a 
contradiction, since the first component of $a_1^q$ and that of $x_1^q$ are equal. 

(b) Let the target supply $b \in \mathbb{R}^{\ell}_{+}$ be given. We apply a fixed point 
argument to show that $b \in \xi(p, a)|_I$ for at least one $a \in \mathbb{R}^{\ell}_{+}$. (There is no 
hope for uniqueness in the present model.) The argument runs as follows: 
Part (a) of the proof implies that there is $c \in \mathbb{R}^{\ell}_{+}$, $c \gg 0$, such that every 
point in $\xi(p, a)|_I$ exceeds $b$ in at least one component unless $a \ll c$. 
Define 

$A := \{a \in \mathbb{R}^{\ell}_{+} \mid a \le c\}$. 

Let $r: \mathbb{R}^{\ell}_{+} \to A$ be the retraction which maps a point $x \in \mathbb{R}^{\ell}_{+} \setminus A$ to $\lambda \cdot x$, 
where $\lambda = \max\{\lambda' \in \mathbb{R} \mid \lambda' \cdot x \in A\}$. Since $a \ge \xi(p, a)|_I$ for all $a \in \mathbb{R}^{\ell}_{+}$, 
we have $a + b - \xi(p, a)|_I \ge 0$. This allows us to define the 
correspondence 

$\varphi: A \to A$ 

by 

$a \mapsto r(a + b - \xi(p, a)|_I)$. 

As $\xi(p, \cdot)$ is a u.h.c. correspondence with compact and convex values, 
$\varphi$ is a u.h.c. correspondence with compact and acyclic values. 

According to the fixed point theorem of Eilenberg and Montgomery 
[2] there is $a^* \in A$ such that 

$a^* \in r(a^* + b - \xi(p, a^*)|_I)$. 

If not $a^* \ll c$, then, by the definition of $c$, every point in $b - \xi(p, a^*)|_I$ 
has a negative component. This contradicts $a^* \in r(a^* + b - \xi(p, a^*)|_I)$ 
since $r(x) \le x$ for all $x \in \mathbb{R}^{\ell}_{+}$. Hence $a^* \ll c$. Therefore $a^* \in 
(a^* + b - \xi(p, a^*)|_I)$, which says $b \in \xi(p, a^*)|_I$. Q.E.D. 


References 

1. G. Debreu, "Theory of Value," Wiley, New York, 1959. 

2. S. Eilenberg and D. Montgomery, Fixed point theorems for multivalued 
transformations, Amer. J. Math. 68 (1946), 214-222. 

3. C. Fourgeaud, B. Lenclud, and P. Sentis, Equilibre, optimum, et décentralisation 
dans un cas de rendement croissant, Cah. Séminaire d'Économétrie CNRS 15 (1974), 
29-46. 

4. R. Guesnerie, Pareto-optimality in non-convex economies, Econometrica 43 (1975), 
1-29. 

5. I. W. Sandberg, A nonlinear input-output model of a multisectored economy, 
Econometrica 41 (1973), 1167-1182. 



JOURNAL OF ECONOMIC THEORY 13, 439-44? (1976) 


Differential Properties of Functions which are 
Solutions to Maximization or Minimization Problems 

Errol Glustoff* 

Department of Economics, The University of Tennessee, Knoxville, Tennessee 37916 
Received May 27, 1975; revised May 10, 1976 


Given the function $f(x, y)$, consider the problem of finding a function 
$g(x)$ so that $f(x, g(x)) = 0$ for all $x$ in some domain. In this paper I wish 
to consider conditions under which the solution function $g$ inherits the 
property of differentiability from $f$. The basic result along these lines is 
of course the classical Implicit Function Theorem, which is, assuming 
that all activity is based on some sort of optimizing behavior, an extremely 
useful tool of economic analysis. For, under the assumption that a regular 
maximum or minimum exists (cf. [4, pp. 357-365]), and the identification 
of $f(x, y) = 0$ as the first-order condition with respect to the vector $y$, 
this theorem implies the desired property, at least locally. As a specific 
example, the implicit function obtained from the first-order conditions 
for utility maximization is the vector of ordinary demand functions, and 
the Implicit Function Theorem may then be invoked to guarantee the 
local existence and continuity of its partial derivatives. A further class 
of applications consists of guaranteeing the differentiability of envelope 
curves, such as the long-run cost curve when derived from the short-run 
curves, or as I used in an alternative approach to the theory of derived 
demand (cf. [2]). 

Unfortunately, this theorem is not so powerful as one might like since 
it only applies "in the small," i.e., only within a $\delta$-neighborhood of the 
initial critical point, and it may be quite difficult to determine the largest 
such $\delta$ for which the relationships hold (cf. [3, p. 49]). Even then there is 
no guarantee that this range is sufficient; indeed, for many purposes, 
including in particular the properties of envelope curves, it is desirable 
if not necessary to know that the results hold over the entire domain of 
the variables. The limitations of this theorem thus impose limitations 
on the scope of many results of economic theory. It seems desirable 

* I wish to thank Professors G. S. Jordan and W. R. Wade and an anonymous referee 
for helpful comments on an earlier version. I, however, retain full responsibility for 
any errors which might remain. 



therefore to find conditions under which the Implicit Function Theorem 
holds globally, that is, under which the local implicit functions can be 
“pieced together” so as to preserve their properties. For then, even if the 
applicability of these conditions cannot be directly or easily verified, 
at least one knows the “costs” of assuming such applicability. This is the 
purpose of the present paper. 

We begin by introducing some notation. Throughout the letter $x$ stands 
for a variable whose range is the subset $X$ of Euclidean $p$-space, denoted $R^p$, 
while $y$ varies over $Y \subseteq R^q$. The function $f$ is defined on $X \times Y$ and has 
values in $R^q$; $f$ may of course be interpreted as the gradient vector of some 
real-valued function with respect to $y$. It is assumed throughout that the 
component functions of $f$, $f^i$ for $i = 1, \dots, q$, all have continuous partial 
derivatives on $X \times Y$; symbolically, $f \in C^1(X \times Y)$. The Jacobian of $f$ 
with respect to $y$, evaluated at $(x^0, y^0)$, will be designated by $J_y(x^0, y^0)$. Let 
$A = \{(x, y) \in X \times Y : f(x, y) = 0 \text{ and } J_y(x, y) \ne 0\}$; that is, together with 
the assumptions about $f$, $A$ consists of precisely those points in the domain 
which satisfy the conditions of the familiar Implicit Function Theorem. 
Let $B$ be an arbitrary subset of $R^p$, $\Pi_p$ the projection mapping on $R^{p+q}$ 
to $R^p$, and $\Pi_p^{-1}$ its inverse. Then, the set $D$ is defined by $D = 
\Pi_p^{-1}[B \cap \Pi_p(A)] \cap A$; the projection of $D$ on $R^p$ (resp. $R^q$) is denoted $D_p$ 
(resp. $D_q$) where, of course, $D_p = \Pi_p(D) = B \cap \Pi_p(A)$. Thus, $D$ is the 
subset of $A$ consisting of only those points, corresponding to the previously 
chosen set $B$, for which we wish to use the relation $f(x, y) = 0$ to solve 
for $y$ in terms of $x$. As will be seen below, it is easy to find examples for 
which the conditions to be placed on $D$ do not apply to the entire set $A$; 
hence, the need for restriction. Of course, it is assumed that $D \ne \emptyset$. 

It is further assumed throughout that $D_q$ is contained in the interior of $Y$. 
Finally, $g$ is a function defined on some subset of $R^p$ with values in $R^q$, 
and $g^i$ is the $i$th component of $g$, $i = 1, \dots, q$. 

We recall the statement of the Implicit Function Theorem, whose proof 
may be found in any advanced calculus textbook: 

Theorem 1. Let $V \subseteq X \times Y$ be a neighborhood of the point $(x^0, y^0)$, 
where $(x^0, y^0) \in A$. Then, there exists a unique function $g \in C^1(W)$, where $W$ 
is a neighborhood of $x^0$, such that $g(x^0) = y^0$ and $f(x, g(x)) = 0$ for all 
$x \in W$. 

Note. From the continuous differentiability of $f$, the continuity of $g$, 
and the continuity of the composition of continuous functions, it can also 
be guaranteed that $J_y(x, g(x)) \ne 0$ locally. Specifically, there exists a 
neighborhood $W'$ of $x^0$ such that $W' \subseteq W$ and $J_y(x, g(x)) \ne 0$ for all 
$x \in W'$. 
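
For reference, the derivative of the implicit function can be written out explicitly. The display below is the standard formula from the Implicit Function Theorem rather than something stated in this paper; $D_x f$ and $D_y f$ are notation introduced here for the matrices of partial derivatives of $f$ with respect to $x$ and $y$.

```latex
% Differentiating the identity f(x, g(x)) = 0 with the chain rule gives
\[
  D_x f(x, g(x)) + D_y f(x, g(x))\, Dg(x) = 0 ,
\]
% and since the Jacobian determinant J_y(x, g(x)) is nonzero on W',
% the q x q matrix D_y f is invertible there, so
\[
  Dg(x) = -\bigl[ D_y f(x, g(x)) \bigr]^{-1} D_x f(x, g(x)), \qquad x \in W' .
\]
```

The formula makes visible why the nonvanishing of $J_y$ along the graph of $g$ matters: it is exactly what keeps $Dg$ well defined and continuous.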





As the proof of the theorem reveals, it is only necessary that the 
neighborhood $V$ be "two-sided" in $y$, i.e., that there exist a number 
$\delta > 0$ such that $y \in \{y \in R^q : |y - y^0| < \delta\}$ implies that $(x^0, y) \in V$; the 
coordinate $x_i^0$ may be the end point of some interval. In this case the 
derivatives referred to, when evaluated at such a point, are of course to be 
regarded as left- or right-hand derivatives. To avoid complicating the 
notation any further, this situation will be explicitly ignored. 

The principal theorem of this paper is first proved for the case p = 1. 
This has the advantage of throwing the arguments into sharper relief. 
Since the proof for the general case uses the identical arguments, it seems 
to be a desirable way to proceed. In any event this case applies quite 
often in economic analysis (cf. the “cost curve” and “derived demand” 
citations above), and so is important in its own right. We shall also 
initially assume that $D_1$ is connected; this assumption will be substantially 
weakened below (Corollary 4). 

Theorem 2. Let $p = 1$ and $D_1$ be connected. If $D$ is compact, then there 
exists a function $g$ defined on all of $D_1$ such that $f(x, g(x)) = 0$ on $D_1$ and 
$g \in C^1(D_1)$. 

Proof. By the Heine-Borel Theorem, the compactness of $D$ trivially 
implies that $D_1$ is compact; since it is also assumed to be connected, 
$D_1 = [a, b]$ for two real numbers such that $-\infty < a < b < \infty$. Let 
$x^0 = a$ and $(x^0, y^0) \in D$. By Theorem 1 and the Note, there exist a unique 
function $g$ and a number $\delta^0 > 0$ with the stated properties. If $[a, b] \subseteq 
[x^0, x^0 + \delta^0)$, then there is nothing to prove, and so we assume otherwise 
and will show that $g$ can be extended so that it has the desired properties 
on $[a, b]$. Define the set $\Delta$ by $\Delta = \{\delta \ge 0 : g$ has a $C^1$ extension on 
$[x^0, x^0 + \delta^0 + \delta)$ so that $x \in [x^0, x^0 + \delta^0 + \delta)$ implies $(x, g(x)) \in D\}$. Since 
this set contains 0 (Theorem 1), and is bounded above by $b - a - \delta^0$, 
it has a supremum $\delta^*$. Define $x^* = x^0 + \delta^0 + \delta^*$, and, for the purpose 
of obtaining a contradiction, assume that $x^* < b$. It must first be shown 
that, if $\delta^* > 0$, then $g$ can be uniquely extended on $[x^0, x^0 + \delta^0 + \delta^*)$. 
Suppose then that $0 < \delta' < \delta'' < \delta^*$, and that $g'$ and $g''$ are the property 
preserving extensions of $g$ on $[x^0, x^0 + \delta^0 + \delta')$ and $[x^0, x^0 + \delta^0 + \delta'')$, 
respectively. Obviously, $g'(x^0 + \delta^0) = g''(x^0 + \delta^0)$, and so, if the set 
$\{x \in (x^0 + \delta^0, x^0 + \delta^0 + \delta') : g'(x) \ne g''(x)\}$ is not empty, containing say $x''$, 
the continuity of the functions and the Infimum Principle imply the 
existence of a point $x' \in [x^0 + \delta^0, x'')$ such that $g'(x') = g''(x')$ and 
$g' \ne g''$ on $(x', x'')$. This, however, contradicts the uniqueness assertion of 
Theorem 1 applied to $(x', g'(x'))$, and hence $g' = g''$ on $[x^0, x^0 + \delta^0 + \delta')$, 
which implies that $g$ is uniquely defined and has the properties of $\Delta$ on 
$[x^0, x^*)$. 

To show that $x^* < b$ results in a contradiction, let $\{x^n\}$ be a sequence 
from the open interval $(x^0, x^*)$ such that $x^n \to x^*$. By definition, the 
sequence $\{(x^n, g(x^n))\} \subseteq D$, and therefore, by the compactness of $D$, has a 
limit point $(x^*, y^*) \in D$. Applying Theorem 1 to this point, there exist 
a unique function $h$ and a number $\eta > 0$ such that $h(x^*) = y^*$, and 
$(x, h(x)) \in A$ and $h \in C^1$, the latter two conditions holding on $(x^* - \eta, 
x^* + \eta)$. It is no loss of generality to suppose that $\eta < \delta^0 + \delta^*$. To show 
that $g \equiv h$ on $(x^* - \eta, x^*)$, note that, for any $(x, y) \in A$, the Inverse 
Function Theorem, applied to the function $\Phi(x, y) = (x, f(x, y))$, 
guarantees that $\Phi$ is locally one-to-one. Therefore, if $V$ is such a 
neighborhood of the point $(x^*, h(x^*))$, then, since a subsequence of $\{(x^n, g(x^n))\}$ 
converges to $(x^*, h(x^*))$, we must have $g = h$ at all $x^n \in [x^0, x^*)$ "near" 
$x^*$, i.e., in the $R^p$-projection of $V$. As before, the uniqueness assertion of 
Theorem 1 then guarantees that $g \equiv h$ on $(x^* - \eta, x^*)$. But $(x, h(x)) \in A$ 
for $x \in (x^* - \eta, x^* + \eta)$, and $h \in C^1((x^* - \eta, x^* + \eta))$. Therefore, it 
follows that $g$ can be uniquely extended by $h$ so that $(x, g(x)) \in D$ and 
$g \in C^1$ on $[x^0, x^0 + \delta^0 + \delta^* + \eta')$, where $\eta' = \min\{\eta, b - x^*\} > 0$. 
Therefore, $\delta^*$ cannot be the supremum of $\Delta$, and this contradiction 
establishes that $x^* \ge b$. Since the existence of $h$ and $\eta$ with the stated 
properties is independent of the assumption $x^* < b$, in fact $(x, g(x)) \in A$ 
and $g \in C^1$ on $[x^0, x^0 + \delta^0 + \delta^* + \eta) = [x^0, b + \eta) \supseteq [a, b] = D_1$. 

Q.E.D. 
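
The extension argument has a simple numerical counterpart: start from a known point on the solution set and carry the solution forward in small steps, re-solving $f(x, y) = 0$ at each step from the previous value. The sketch below is only an illustration for $q = 1$; the helper names, the step count, and the use of Newton's method are choices made here, not part of the proof.

```python
import math


def newton_solve(f, dfdy, x, y, max_iterations=50, tol=1e-12):
    """Solve f(x, y) = 0 for y by Newton's method, starting from the guess y."""
    for _ in range(max_iterations):
        residual = f(x, y)
        if abs(residual) < tol:
            break
        # The step is well defined as long as the Jacobian df/dy is nonzero.
        y -= residual / dfdy(x, y)
    return y


def continue_branch(f, dfdy, x_start, y_start, x_end, n_steps=200):
    """Trace the implicit function from (x_start, y_start) across [x_start, x_end]."""
    step = (x_end - x_start) / n_steps
    xs, ys = [x_start], [y_start]
    y = y_start
    for k in range(1, n_steps + 1):
        x = x_start + k * step
        y = newton_solve(f, dfdy, x, y)  # previous value is the initial guess
        xs.append(x)
        ys.append(y)
    return xs, ys


# Example: the upper branch of the unit circle on the closed interval [-0.9, 0.9].
circle = lambda x, y: x * x + y * y - 1.0
circle_dy = lambda x, y: 2.0 * y
xs, ys = continue_branch(circle, circle_dy, -0.9, math.sqrt(1.0 - 0.81), 0.9)
```

On this interval the traced values agree with $\sqrt{1 - x^2}$ up to the solver tolerance. The sketch would break down near $x = \pm 1$, where the Jacobian $2y$ vanishes, which is the situation the compactness of $D$ is meant to exclude.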

The point $x^0$ and its counterpart $(x^0, y^0)$ play no significant role in the 
proof of the theorem. By choosing any arbitrary point $(x, y) \in D$, the 
above argument can still be applied (to both sides if $a < x < b$), thus 
obtaining the following stronger result: 

Corollary 1. Under the hypotheses of Theorem 2, to any point in $D$, 
there corresponds a function $g$ containing that point and satisfying the 
conclusions of Theorem 2. 

Using this corollary, it will now be shown that the compactness of $D$ is 
equivalent to the statement that there is a positive integer $M$ such that, 
for each $x \in D_1$ there are precisely $M$ distinct points $(x, y^i) \in D$; that is, 
for each $x$, the equations $f(x, y) = 0$ have $M$ distinct solutions at which 
the Jacobian with respect to $y$ does not vanish. Using the notation $|S|$ 
to denote the number of elements of the set $S$, this result is 

Corollary 2. Let $p = 1$ and $D_1$ be connected. Then, $D$ is compact if 
and only if $D_1$ is compact and there exists an integer $M$ such that, for every 
$x \in D_1$, $|\Pi_1^{-1}[\{x\} \cap \Pi_1(D)] \cap A| = M$. 





Proof. We consider sufficiency first, i.e., that the latter two conditions 
imply the compactness of $D$. To show that $D$ is bounded, note that 
$D \subseteq D_1 \times D_q$; thus, since $D_1$ is compact by hypothesis, it suffices to show 
that $D_q$ is bounded. This in turn will be demonstrated by showing that $D_q$ 
is a subset of a particular compact set, which we now proceed to construct. 
For all $(x^0, y^0) \in D$, there exist a function $g_{(x^0, y^0)}$ and a number $\delta_{(x^0, y^0)} > 0$ 
such that $g_{(x^0, y^0)}(x^0) = y^0$, and also that $f(x, g_{(x^0, y^0)}(x)) = 0$, $g_{(x^0, y^0)} \in C^1$, 
and $J_y(x, g_{(x^0, y^0)}(x)) \ne 0$, where these last three properties hold on the 
set $E = \{x \in D_1 : |x - x^0| < \delta_{(x^0, y^0)}\}$. Let $\delta_{x^0} = \tfrac{1}{2} \min\{\delta_{(x^0, y^{0i})} : i = 1, \dots, M\}$ 
for all $x^0 \in D_1$. Since $y^{0i} \ne y^{0j}$ for $i \ne j$, by choosing the $\delta_{(x^0, y^{0i})}$ small 
enough, it can be guaranteed that, for every $x \in (x^0 - \delta_{x^0}, x^0 + \delta_{x^0})$, 
$g_{(x^0, y^{0i})}(x) \ne g_{(x^0, y^{0j})}(x)$ for $i \ne j$; i.e., the functions are distinct on the 
neighborhood defined by $\delta_{x^0}$. Clearly, $D_1 \subseteq \bigcup_{x \in D_1} (x - \delta_x, x + \delta_x)$. 
Therefore, by the compactness of $D_1$, there exists $\{x^1, \dots, x^r\} \subseteq D_1$ such that 
$D_1 \subseteq \bigcup_{i=1}^{r} (x^i - \delta_i, x^i + \delta_i)$. Let $g_{ij} = g_{(x^i, y^{ij})}$, $i = 1, \dots, r$, $j = 1, \dots, M$, 
where of course $(x^i, y^{ij}) \in D$, and let $F_i = [x^i - \delta_i, x^i + \delta_i] \cap D_1$. 
Since $D_1$ is compact, $F_i$ is compact. Hence, the continuity of $g_{ij}$ implies 
that $g_{ij}(F_i)$ is compact for all $i$ and $j$, and therefore $Q = \bigcup_{i=1}^{r} \bigcup_{j=1}^{M} g_{ij}(F_i)$ 
is compact. It only remains to show that $D_q \subseteq Q$; let $y \in D_q$. By definition, 
there exists an $x \in D_1$ such that $(x, y) \in D$. Therefore, for some $i$ between 1 
and $r$, $x \in (x^i - \delta_i, x^i + \delta_i) \cap D_1 \subseteq F_i$; to simplify notation, let this 
$i = 1$. Hence, by the choice of $g_{1j}$, $f(x, g_{1j}(x)) = 0$ and $J_y(x, g_{1j}(x)) \ne 0$, 
$j = 1, \dots, M$, and therefore, by hypothesis, there exists a $j$, say $j = 1$, 
such that $y = g_{11}(x)$. Therefore, $y \in g_{11}(F_1)$, which implies that $D_q \subseteq Q$. 

To show that $D$ is closed, let $\{(x^n, y^n)\} \subseteq D$ such that $(x^n, y^n) \to (x^0, y^0)$. 
Therefore, $x^n \to x^0$, and so, by the closure of $D_1$, $x^0 \in D_1$. Hence, there 
exists $\{y^{01}, \dots, y^{0M}\} \subseteq Y$ such that $(x^0, y^{0i}) \in D$, $i = 1, \dots, M$. Applying 
Theorem 1 to each of these points, we obtain the $M$ distinct functions $g_i$ 
and a number $\delta > 0$ such that $g_i$ is continuous on $(x^0 - \delta, x^0 + \delta)$, 
$g_i(x^0) = y^{0i}$, and $(x, g_i(x)) \in D$ for $x \in (x^0 - \delta, x^0 + \delta)$, $i = 1, \dots, M$. Now 
since $\{x^n\}$ converges to $x^0$, there exists an integer $N$ such that 
$x^n \in (x^0 - \delta, x^0 + \delta)$ for $n \ge N$. Therefore, by hypothesis and the fact 
that $(x^n, y^n) \in D$ for all $n$, for each $n \ge N$, there exists an $i$ such that 
$y^n = g_i(x^n)$. Since the sequence $\{y^n : n \ge N\}$ is an infinite set, and since $i$ 
varies over a finite set, it follows that, for some specific $i$, say $i = 1$, there 
exists a subsequence $\{y^{n_k}\}$ of $\{y^n\}$ for which $y^{n_k} = g_1(x^{n_k})$. Now, $y^{n_k} \to y^0$, 
and, by the continuity of $g_1$, $g_1(x^{n_k}) \to y^{01}$. Therefore, since the limit of a 
convergent sequence is unique, $y^0 = y^{01}$. Consequently, $(x^0, y^0) \in D$, and 
so $D$ is closed and hence compact. 

Conversely, note first that the compactness of $D$ trivially implies the 
compactness of $D_1$. Second, for every $(x, y) \in D$, there exists, by the 
Inverse Function Theorem applied again to $\Phi(x, y) = (x, f(x, y))$, a 
$\delta > 0$ such that $\Phi$ is one-to-one in the open ball centered at $(x, y)$ with 
radius $\delta$. The collection of such open balls is an open covering of $D$, and 
so there exists a finite number, say $M$, which cover $D$. Since $\Phi$ is locally 
one-to-one, it follows that if $(x', y')$ belongs to $D$ and to the open ball with 
center $(x, y)$, and if $(x', y'') \in D$ and $y' \ne y''$, then $(x', y'')$ must belong to 
some other open ball in the finite subcovering. Therefore, for every $x \in D_1$, 
the set $\{y : (x, y) \in D\}$ can contain at most $M$ distinct members. 

Let $M(x)$ be the number of distinct points $y$ such that $(x, y) \in D$. It 
follows that $\{M(x) : x \in D_1\}$ is a nonempty, bounded set of integers and so 
has a maximum which we assume to be $M$. Let $x^0 \in D_1$ such that 
$M(x^0) = M$. Now, by Corollary 1 of Theorem 2, for each point 
$(x^0, y^{0i}) \in D$, $i = 1, \dots, M$, there exists a function $g_i$ such that $g_i(x^0) = y^{0i}$, 
$g_i \in C^1([a, b])$, and $(x, g_i(x)) \in D$ for all $x \in D_1 = [a, b]$. Clearly, $g_i \ne g_j$ 
in a neighborhood of $x^0$; in order to show that this holds for all of $D_1$, 
one need only appeal to the Inverse Function Theorem as used in the proof 
of Theorem 2. Therefore, for any $x \in D_1$, $(x, g_i(x)) \in D$, $i = 1, \dots, M$, and 
$g_i(x) \ne g_j(x)$. Therefore, $M(x) \ge M$, and so $M(x) = M$ for all $x \in D_1$. 

Q.E.D. 

That the set $A$ is frequently "too big" to satisfy the hypothesis of 
compactness can be most easily seen by considering the unit circle 
$f(x, y) = x^2 + y^2 - 1 = 0$. In this case $\Pi_1(A) = (-1, 1)$, which is of 
course not closed. Therefore, by Corollary 2 or directly, $A$ is not compact, 
but by taking $B$ to be any closed interval contained in $(-1, 1)$, Theorem 2 
can be applied to this function. On the other hand the following picture 
shows that the compactness of $D_1$ is by itself insufficient. In this case, 
however, there is a weaker result, applicable only when $p = 1$, which 
guarantees the existence of a function $g$ satisfying all of the conclusions of 
Theorem 2 with the exception that its derivatives exist and are continuous 
on the entire domain exclusive of a finite number of points. Its proof is a 
straightforward application of the notion of compactness and is therefore 
omitted. Finally, the compactness of $D$ can be slightly relaxed, as the 
following result shows. 
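
The unit-circle example can be checked directly by counting, for each $x$, the solutions of $f(x, y) = 0$ at which the Jacobian $2y$ does not vanish. The sketch below is illustrative only; `circle_branches` is a hypothetical helper and the grid on $B = [-0.9, 0.9]$ is an arbitrary choice.

```python
import math


def circle_branches(x, tol=1e-12):
    """Return all y with x^2 + y^2 - 1 = 0 and Jacobian 2y != 0."""
    discriminant = 1.0 - x * x
    if discriminant <= tol:
        # Either no real solution, or the only solution is y = 0,
        # where the Jacobian 2y vanishes and the point is not in A.
        return []
    root = math.sqrt(discriminant)
    return [-root, root]


# On the closed interval B = [-0.9, 0.9] the count is constant (M = 2).
grid = [-0.9 + 0.01 * k for k in range(181)]
counts = {len(circle_branches(x)) for x in grid}

# At the end points of (-1, 1) the count drops to zero.
edge_count = len(circle_branches(1.0))
```

The constant count on $B$ is the condition of Corollary 2, while the drop at $x = \pm 1$ reflects that $\Pi_1(A) = (-1, 1)$ is not closed.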



[Figure: the set {(X, Y): f(X, Y) = 0, J_y(X, Y) ≠ 0}.] 



Corollary 3. Theorem 2 remains true if the assumption that $D$ is 
compact is replaced by the requirements that $D$ is closed and either 

(a) $D_q$ is bounded, 

(b) $g$ (as given by Theorem 1) is uniformly continuous wherever 
defined, or 

(c) the derivative of $g$ is bounded on its domain. 

Proof. If $D_1$ has a bound, then we choose $x^0$ (see Theorem 2) to be that 
point; otherwise, $x^0$ is any arbitrary point of $D_1$. We then proceed 
precisely as in the proof of Theorem 2 to define the nonempty set $\Delta$ of 
extension lengths. This set has no supremum only if it is unbounded, in 
which case there is nothing more to prove. If a supremum exists, then we 
need only show that $\{g(x^n)\}$ has a convergent subsequence. Under (a), this 
is a trivial consequence of the Bolzano-Weierstrass Theorem; under (b), the 
uniform continuity of $g$ and the convergence of $\{x^n\}$ imply that $\{g(x^n)\}$ is a 
Cauchy, and hence convergent, sequence. Finally, under (c), a corollary to the 
Mean Value Theorem implies that $\{g(x^n)\}$ is unbounded only if the 
derivative of $g$ is unbounded. Therefore, under each hypothesis, $\{g(x^n)\}$ 
has a convergent subsequence, and the remainder of the proof of 
Theorem 2 applies verbatim. Q.E.D. 

To this point it has been assumed that $D_1$ is connected. Suppose instead 
that $D_1$ is the union of a finite number of nondegenerate, compact, 
disjoint intervals. Since the interest of this paper is in the differentiability 
of a particular function, it makes little sense to consider sets containing 
isolated points. Theorem 2 then asserts that each of the component 
intervals is the domain of a suitable function. Because of the disjointness 
of the intervals, there is no problem in then piecing together a function 
from these separate ones which is continuously differentiable everywhere. 
Moreover, as is obvious from Corollary 2, the number of critical points 
must be the same for any interval, but can vary across intervals. Hence 

Corollary 4. Let $p = 1$. If $D_1 = \bigcup_{k=1}^{r} [a_k, b_k]$, $r < \infty$, $a_k < b_k$, 
and if, for every $k$, there exists an integer $M_k$ such that, for every $x \in [a_k, b_k]$, 
$|\Pi_1^{-1}[\{x\} \cap \Pi_1(D)] \cap A| = M_k$, then the conclusions of Theorem 2 hold. 

Theorem 3. Let $p$ be an arbitrary positive integer, and let $D_p = [a_1, b_1] 
\times \cdots \times [a_p, b_p]$, where $-\infty < a_i < b_i < \infty$, $i = 1, \dots, p$. Then, if $D$ is 
compact, there exists a function $g$ defined on all of $D_p$ such that $f(x, g(x)) = 0$ 
on $D_p$ and $g \in C^1(D_p)$. 





Proof. Let $x^0$ be the center of the $p$-dimensional compact interval $D_p$,
i.e., $x^0 = (a_1 + \tfrac{1}{2}(b_1 - a_1), \ldots, a_p + \tfrac{1}{2}(b_p - a_p))$, and, for $\delta > 0$, $H^0(\delta) \equiv
\{x \in R^p : |x_i - x_i^0| < \delta(b_i - a_i),\ i = 1, \ldots, p\}$. Again let $(x^0, y^0) \in D$ and
use Theorem 1 to obtain the unique function $g$ and number $\delta^0 > 0$ so
that the set $A \equiv \{\delta > 0 : g$ has a $C^1$ extension such that $x \in H^0(\delta^0 + \delta)$
implies $(x, g(x)) \in D\}$ is not empty. (Of course, for want of something to
prove, we assume that $\delta^0 < 1$.) Since $A$ is bounded by $1 - \delta^0$, it has a
supremum $\delta^*$. As before, if $\delta^* > 0$, then, working with $\delta$, $g$ can be shown
to be uniquely extended from $H^0(\delta^0)$ to $H^0(\delta^0 + \delta^*)$. Assume that
$\delta^* < 1 - \delta^0$, let $x^*$ be any point on the boundary of $H^0(\delta^0 + \delta^*)$,
denoted $H^0[\delta^*]$, and let $\{x^{*n}\}$ be a sequence in $H^0(\delta^0 + \delta^*)$ converging
to it. Again, we obtain an $(x^*, y^*) \in D$ which is a limit point of the sequence
$\{(x^{*n}, g(x^{*n}))\}$. Theorem 1 then yields the function $h$ and set $H^*(\eta)$,
$\eta > 0$, so that $h(x^*) = y^*$, and $(x, h(x)) \in D$ and $h \in C^1$ for all $x \in H^*(\eta)$.
Just as in the proof of Theorem 2, it is then shown that $g = h$ on
$H^0(\delta^0 + \delta^*) \cap H^*(\eta)$. Note that, if $x' \in H^0[\delta^*]$ and $h'$ and $H'(\eta')$ are the
corresponding function and set obtained from Theorem 1, then $g = h$ on
$H^0(\delta^0 + \delta^*) \cap H^*(\eta)$ and $g = h'$ on $H^0(\delta^0 + \delta^*) \cap H'(\eta')$ imply that
$h = h'$ on $H^0(\delta^0 + \delta^*) \cap H^*(\eta) \cap H'(\eta')$. Hence, for any point $x \in H^0[\delta^*]$,
$g$ can be unambiguously extended. Since $H^0[\delta^*]$ is compact and the
collection $\{H^x(\eta_x)\}$ for all $x \in H^0[\delta^*]$ is an open covering, there is a finite
subcovering $\{H^i(\eta_i)\}$, $i = 1, \ldots, r$. Letting $\eta^* \equiv \min\{\eta_i : i = 1, \ldots, r\}$,
$\eta^* > 0$, and so $\delta^*$ cannot be the supremum of $A$. This contradiction
establishes the result. Q.E.D.

The use of both the center of $D_p$ and the shrunken version of $D_p$ as the
neighborhood in the proof of Theorem 3 makes the proof easier to picture
but is not necessary. By merely redefining $H^0(\delta)$ to be the set

$$\{x \in R^p : \max\{a_i, x_i^0 - \delta\} < x_i < \min\{b_i, x_i^0 + \delta\},\ i = 1, \ldots, p\}$$

for any $x^0 \in D_p$, it can be seen that Corollary 1, and hence, Corollary 2,
of Theorem 2 applies to the case where $p$ is an arbitrary positive integer.
An important special case of the latter, which incidentally has a trivial 
proof, is when $M = 1$. This occurs when, for example, there exists a
unique, regular maximizing (resp. minimizing) vector y for each x. As is 
well known, strict quasi-concavity (resp. convexity) of the objective 
function is a sufficient condition for such uniqueness; we conclude by 
stating and proving a condition on the set D which is also sufficient. 

Corollary. Theorem 3 remains valid if the compactness of D is replaced 
by the condition that D be a convex set. 



DIFFERENTIAL PROPERTIES 




Proof. By the extension to Corollary 2, it suffices to show that
$M(x) = 1$ for $x \in D_p$, i.e., that $(x, y^1) \in D$ and $(x, y^2) \in D$ imply that
$y^1 = y^2$. Let $F$ be the $q \times 1$ column vector with term $(f^i(x, y^2) - f^i(x, y^1))$,

$$G(x, y) \equiv \big((\partial f^i/\partial x_j)(x, y) \mid (\partial f^i/\partial y_k)(x, y)\big), \quad i = 1, \ldots, q,\ j = 1, \ldots, p,\ k = 1, \ldots, q,$$

be a $q \times (p + q)$ matrix,

$$z = (0, \ldots, 0,\ y_1^2 - y_1^1, \ldots, y_q^2 - y_q^1)^{T}$$

be a $(p + q) \times 1$ column vector, and $e = (1, 0, \ldots, 0) \in R^q$. By a corollary
to the Mean Value Theorem [1, Corollary 20.12], for any $q$-tuple of real
numbers $e$, there exists a $\lambda \in [0, 1]$ such that $F \cdot e = G(x, y^3) \cdot z \cdot e$,
where $(x, y^3) = \lambda(x, y^1) + (1 - \lambda)(x, y^2)$, and by the convexity of $D$,
$(x, y^3) \in D$. Since $(x, y^1)$ and $(x, y^2)$ both belong to $D$, $F$ is a null vector,
and hence $F \cdot e = 0_{q,q}$, the $q \times q$ null matrix. Clearly, $G(x, y) \cdot z =
G'(x, y) \cdot z'$, where $G'(x, y) \equiv ((\partial f^i/\partial y_j)(x, y))$, $i = 1, \ldots, q$, $j = 1, \ldots, q$,
and $z' \equiv (y_i^2 - y_i^1)$. Therefore, by the associativity of matrix multiplication,
we have $0_{q,q} = [G'(x, y^3) \cdot z'] \cdot e = G'(x, y^3) \cdot (z' \cdot e)$. Now,
$\det G'(x, y^3) = J_y(x, y^3) \ne 0$, since $(x, y^3) \in D$. Therefore, $G'^{-1}(x, y^3)$
exists. Hence, we have $0_{q,q} = G'^{-1}(x, y^3) \cdot 0_{q,q} = z' \cdot e$. But $z' \cdot e =
(y_i^2 - y_i^1 \mid 0_{q,q-1})$. Therefore, by the definition of matrix equality, $y^2 = y^1$.

Q.E.D.


References 

1. R. G. Bartle, “The Elements of Real Analysis,” Wiley, New York, 1964. 

2. E. Glustoff, A generalization of the neoclassical theory of derived demand, The
University of Tennessee Center for Business and Economic Research, Knoxville,
Tennessee, Working Paper No. 12.

3. G. Hadley, “Nonlinear and Dynamic Programming,” Addison-Wesley, Reading,
Mass., 1964.

4. P. A. Samuelson, “Foundations of Economic Analysis,” Harvard Univ. Press, 
Cambridge, Mass., 1947. 




JOURNAL OF ECONOMIC THEORY 13, 448-457 (1976)


Notes, Comments, and Letters to the Editor 

Age-Income Profiles, Income Distribution and 
Transition Proportions 


This paper presents a model to link age-income profiles, income distribution, 
and transition proportions. Transition proportions play a central role in the
Markov-chain approach to income distribution. This stochastic model is much 
criticized, but it is shown in the paper that its most attractive characteristic can 
be maintained, while at the same time integrating it with a micro-economic 
foundation of age-income profiles. These profiles are inferred from capability 
development and individual choice. The model also analytically generates an 
income distribution. 


1. Introduction 

This paper presents a model to link age-income profiles, personal income 
distribution, and transition proportions. An age-income profile specifies 
the relation between an individual’s age and his income; transition 
proportions play a central role in the Markov-chain approach to income 
distribution. Income is understood to be labor income. The model is a first 
attempt to provide an integrated treatment of these elements and therefore 
uses rather simple specifications; the main purpose is to show how the 
link can be made. 

Basically, age-income profiles can be explained in two steps. The first 
step identifies the variables that are relevant to the explanation of 
individual income and the second indicates how these variables relate to 
an individual’s age. Reflection on the first step has led many authors to 
point to the role of an individual’s attributes or capabilities [11-13, 17, 20].
To analyze the role of capabilities for individual labor incomes, jobs may
be viewed as demanding certain degrees of various capabilities; individuals
are endowed with specified degrees of these capabilities and in the labor
market these supply and demand conditions are confronted. A price
per unit of capability results, and thus individual income is derived from
capability endowment and price per unit [13, 16, 19-21].

Given this way of explaining individual incomes, the next step is to 
specify the development of individual capabilities with age. Lydall [11] 
draws on psychological studies about intelligence and also points to the
effect of accumulation of experience. In his presentation, the development
of capabilities is an autonomous process. In contrast to this view is the 
human capital approach [14]. There, the development of capabilities is 
interpreted as an endogenous process: Individuals deliberately undertake 
to improve their skills. Insofar as this improvement has a cost, either as 
direct outlay or as forgone income, the training can be viewed as an 
investment, which is only undertaken at the prospect of a sufficient rate 
of return. This approach thus immediately links income changes and 
development of skills employing supply side considerations. Development 
of skills, however, is not necessarily the result of deliberate training 
programs. It may also result as joint production, as a by-product of 
performing a specified task [5]. The literature on learning curves [1, 3, 10]
also points in this direction. The view on age-income profiles taken in this 
paper ties in with the explanation of wages through capabilities and adds 
dynamic features to it. An extended review of the related literature is given 
by Hartog [7]. 

Attempts to explain the personal distribution of income have sometimes 
employed stochastic models. The approach taken by Champernowne [2]
is an application of the so-called Markov-model. This model states that
there exist probability distributions for individual incomes in this period, 
given income in the previous period. Income is measured in intervals and 
the model consists of a set of transition probabilities, each indicating the 
probability that an income in interval i in this period will be in interval j 
in the next. Given these transition probabilities, the income distribution 
will, under certain conditions, move towards an equilibrium distribution 
independent of any original distribution. This model was estimated for the 
Netherlands by Hartog [6] and Mustert [15]. 
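
The convergence property just described can be illustrated with a minimal sketch. The three-interval transition matrix below is purely hypothetical (it is not taken from Champernowne or from the Dutch estimates), and the function names `step` and `iterate` are illustrative choices. The sketch only shows that two opposite starting distributions end up at the same equilibrium distribution when the chain is irreducible and aperiodic.

```python
def step(shares, transition):
    """Advance one period: new_j = sum_i shares_i * transition[i][j]."""
    n = len(transition)
    return [sum(shares[i] * transition[i][j] for i in range(n)) for j in range(n)]


def iterate(shares, transition, periods):
    """Apply `step` repeatedly and return the resulting distribution."""
    for _ in range(periods):
        shares = step(shares, transition)
    return shares


# Hypothetical transition probabilities between three income intervals.
# Row i holds the probabilities of moving from interval i to intervals 0, 1, 2.
TRANSITION = [
    [0.7, 0.3, 0.0],
    [0.2, 0.6, 0.2],
    [0.0, 0.3, 0.7],
]

# Start once with everybody in the lowest interval, once in the highest.
from_bottom = iterate([1.0, 0.0, 0.0], TRANSITION, 200)
from_top = iterate([0.0, 0.0, 1.0], TRANSITION, 200)
print([round(v, 4) for v in from_bottom])
print([round(v, 4) for v in from_top])
```

For this particular matrix both runs settle at the distribution (2/7, 3/7, 2/7), which is the solution of the equilibrium condition that the distribution is reproduced by one further transition.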

The stochastic nature of such a model is subject to much criticism: “The 
model has barely explanatory content, at best it has descriptive value.”
However, the most interesting aspect of the model is the emphasis on 
dynamic processes within the income distribution, i.e., on the combination 
of changes in individual incomes and overall stability of the income 
distribution. This property of the model can be exploited without recourse 
to the stochastic foundations. It is then necessary to phrase the model in 
terms of proportions rather than probabilities. 

Section 2 presents the basic model in which this is accomplished; the 
model generates an income distribution as well. Section 3 provides a 
specification of the relations involved in the basic model and yields explicit 
solutions. Finally, Section 4 concludes the paper with some evaluating 
remarks. 





2. The Basic Model 

Let there exist a relation between an individual’s age a and his income y: 

$$y = g(a), \qquad dg/da > 0, \qquad y \ge 0, \tag{1}$$

and let this relation have an inverse: 

a = a(y). (2) 

Let the age density function be given by f(a). If the age-income profile 
holds for all individuals, the distribution of individuals by income follows 
as a transformation of the age distribution through the age-income profile 
(see, e.g., [4, p. 9]):

$$\Phi(y) = f\{a(y)\}\,|da/dy|. \tag{3}$$

The parameters of the Markov-model may be derived in the following way.
Let individual incomes be assigned to intervals, numbered $i = 1, 2, \ldots, h$,
$h$ indicating the highest income interval. Interval $i$ is determined by its
upper boundary $k_i$ and the distance between upper and lower boundary,

$$q_i = k_i - k_{i-1}. \tag{4}$$

An income $y$ is assigned to interval $i$ if $k_{i-1} \le y < k_i$. It is assumed that
classification of incomes takes place once a year. At that moment, income 
is measured as a flow per year, as indicated by the age-income profile. 
It is further assumed that the individuals whose incomes are classified all 
move along the same profile and that the age density function is stable. The 
format that was chosen for the age-income profile implies that individuals 
either will move up to a higher income interval or stay where they are. 
Consider first upward mobility. 

Upward mobility results from two counteracting forces. Individual 
income increases from year to year, and this implies that each year a 
number of individuals will move from their present income interval to a 
higher one. However, there will also be “mortality”: individuals will, 
for a number of reasons, withdraw from the labor force. This effect will 
be studied later. Ignoring mortality for the moment, an individual will 
leave interval i within a year if the increase in income during the year is 
sufficient to boost income above $k_i$. Since the age-income profile has a
positive slope, it is possible to define a variable $b_i$ as the increase in income
during one year such that at the end of the year income equals $k_i$.
All individuals with income $y > k_i - b_i$ in class $i$ have income sufficient
to attain income class $i + 1$ next year. It will be assumed that $k_i - b_i >
k_{i-1}$. The proportion of individuals in income interval $i$ whose income level
is sufficiently high to reach interval $i + 1$ next year, $r_i^*$, is then

$$r_i^* = \int_{k_i - b_i}^{k_i} \Phi(y)\,dy \Big/ \int_{k_{i-1}}^{k_i} \Phi(y)\,dy. \tag{5}$$

However, since there is a one-to-one link between income and age, 
(5) may also be expressed in terms of age. That is, instead of finding out 
how many people have the proper income level within an interval to grow 
beyond the upper boundary of that interval, one may try to find out how 
many people have the proper age, such that their corresponding income is 
sufficient to grow beyond the interval in one year. Denote by a( y) the age 
at which income level y is reached. Then (5) may be rewritten: 

$$r_i^* = \int_{a(k_i - b_i)}^{a(k_i)} f(a)\,da \Big/ \int_{a(k_{i-1})}^{a(k_i)} f(a)\,da. \tag{6}$$

Equation (6) can be simplified if the definition of $b_i$ is recalled: It is the
income increase during a year corresponding to end-of-year income level
$k_i$. But obviously, the age difference needed for an annual income
difference $b_i$ is just one year. Therefore $a(k_i - b_i)$ must equal $a(k_i) - 1$
and (6) may be rewritten:

$$r_i^* = \int_{a(k_i) - 1}^{a(k_i)} f(a)\,da \Big/ \int_{a(k_{i-1})}^{a(k_i)} f(a)\,da. \tag{7}$$

Next, consider mortality rates (i.e., rates of departure from the labor
force). Mortality will be assumed constant and independent of age, at a rate
of $\delta$ per unit of time. This corresponds to a probability of surviving to age
$a$, $p(a)$, of

$$p(a) = e^{-\delta a}, \qquad \delta > 0, \tag{8}$$

and a rate of departure within a year of $1 - e^{-\delta}$. Since the age density
function has already been assumed to be stable, it is restricted to $f(a) =
\beta e^{-\rho a} p(a)$, where $\beta$ is the birth rate and $\rho$ is the rate of growth of the labor
force (see, for example, [16]). Substituting for $p(a)$ from (8), the age density
function may be written as

$$f(a) = \beta e^{-\beta a}, \qquad \beta > 0, \tag{9}$$

where the following relation holds: $\beta = \rho + \delta$.




Therefore, the labor force is stable (the age structure is constant) but
not stationary: It grows at a rate $\rho = \beta - \delta$.

From the constant mortality rate, the rate of departure from income
class $i$ within a year, $r_{ui}$, follows immediately:

$$r_{ui} = 1 - e^{-\delta}. \tag{10}$$

Upward mobility then equals

$$r_{pi} = (1 - r_{ui})\, r_i^* = e^{-\delta} r_i^*. \tag{11}$$

The final variable to be defined is the proportion of those who will stay
in income class $i$ within a year, $r_{bi}$, which is simply the remainder from
upward mobility and departure:

$$r_{bi} = 1 - r_{ui} - r_{pi}. \tag{12}$$
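
The chain from Eq. (7) to Eq. (12) can be traced numerically for any increasing profile. The following is a minimal sketch under stated assumptions: the helper names (`integrate`, `transition_proportions`), the linear profile $y = 1000a$, and all parameter values are hypothetical and serve only to illustrate the bookkeeping; a simple midpoint rule stands in for the integrals.

```python
import math


def integrate(func, lo, hi, steps=2000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (j + 0.5) * width) for j in range(steps))


def transition_proportions(age_of_income, age_density, k_lower, k_upper, delta):
    """Return (r_u, r_p, r_b) for the income interval [k_lower, k_upper)."""
    a_hi = age_of_income(k_upper)
    a_lo = age_of_income(k_lower)
    if a_hi - 1.0 <= a_lo:
        # The text assumes k_i - b_i > k_{i-1}: the interval spans more than one year of age.
        raise ValueError("interval must span more than one year of age")
    r_star = (integrate(age_density, a_hi - 1.0, a_hi)
              / integrate(age_density, a_lo, a_hi))   # Eq. (7)
    r_u = 1.0 - math.exp(-delta)                      # Eq. (10): departure
    r_p = (1.0 - r_u) * r_star                        # Eq. (11): upward mobility
    r_b = 1.0 - r_u - r_p                             # Eq. (12): staying
    return r_u, r_p, r_b


# Illustrative values only (not from the paper).
BETA, DELTA = 0.05, 0.03


def age_density(a):
    """Stable age density of Eq. (9)."""
    return BETA * math.exp(-BETA * a)


def age_of_income(y):
    """Inverse of a hypothetical linear profile y = 1000 * a."""
    return y / 1000.0


r_u, r_p, r_b = transition_proportions(age_of_income, age_density, 10000.0, 15000.0, DELTA)
print(round(r_u, 4), round(r_p, 4), round(r_b, 4))
```

By construction the three proportions add up to one for every interval, which is the accounting identity behind Eq. (12).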


3. A Specification 


3.1. Age-income Profile 

The specification of the age-income profile that will be used here is a 
simplification of a more general theory, in which both labor supply and 
labor demand are analyzed as capability bundles (see [8, 9]). For present
purposes, the model may be reduced to include only one capability. 

To analyze labor supply, individuals are distinguished by labor force age, 
measured continuously as duration of labor force attachment. At every 
age, the individual can supply a certain amount of labor. Labor is measured 
in experience units. Thus, a more experienced worker commands a greater 
stock of standardized units of labor. The size of his stock of labor at age a 
is indicated by $x$; it measures the maximum number of units of labor the
individual can supply. As he grows older, he increases this capacity. This
can be expressed as follows:

$$x = x_m (1 - e^{-\gamma a}), \qquad \gamma > 0. \tag{13}$$

Equation (13) indicates that the process of accumulating experience
converges to a limit $x_m$; the rate of convergence is indicated by $\gamma$. Some
support for this way of representing the accumulation of experience can 
be found in the literature on learning curves [1, 3, 10]. The learning curve 
is based on the observation that the time needed to perform a specified 
task depends on the number of times this task has been performed before;
this relation is usually taken in log-linear form. Equation (13) is a representation of
the inverse of the learning curve. The parameters $x_m$ and $\gamma$ can be varied
to accommodate the way experience is accumulated under different circumstances
(e.g., in different jobs; more complicated jobs may show a longer
period of substantial growth of productivity with experience).

The individual decides how much of his capacity he will supply. That is, 
he offers a proportion s (0 < s < 1) of his stock of labor. His decision 
follows from choosing an optimum combination of consumption (income) 
and effort over the remainder of his working life. It will be assumed that 
the choice of effort is made only once and his seeking an optimum effort- 
consumption combination over his entire working life then results in a 
relation between effort and the wage rate per unit of capability, w (see [8]): 

$$s = \tau(w). \tag{14}$$

Now the individual labor supply function can be derived. Individual
supply of labor is defined as $L = sx$. Substituting (13) and (14) and
integrating over age yields aggregate supply of labor (in experience units)
as a function of the wage rate.

A similar analysis can be made of the demand side of the labor market.
Assuming the basic labor input in the production function is an
experience unit, a demand relation may be derived from the hypothesis of
profit maximization (see [9]). Confronting supply and demand then yields
an equilibrium wage rate per unit of experience $w^*$. The age-income
profile then follows as

$$y = y_m (1 - e^{-\gamma a}), \tag{15}$$

where

$$y_m = w^* \tau(w^*)\, x_m.$$

According to (15) income converges to a maximum y m . This maximum is 
determined by the value of the maximum attainable experience level x m . 
The rate at which income converges towards its maximum equals that of 
experience. 

3.2. Income Distribution 

To find the income distribution, recall the age-income profile (15) and
its inverse

$$a = -(1/\gamma) \ln(1 - y/y_m). \tag{16}$$





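Differentiating (16) and substituting the result and (16) into the age
density function as prescribed by Eq. (3) yields

$$\Phi(y) = (\beta/\gamma)\, y_m^{-\beta/\gamma} (y_m - y)^{(\beta/\gamma) - 1}. \tag{17}$$

The slope of $\Phi(y)$ depends on $\beta/\gamma$:

$$\frac{d\Phi}{dy} = -\frac{\beta}{\gamma}\, y_m^{-\beta/\gamma} \left(\frac{\beta}{\gamma} - 1\right) (y_m - y)^{(\beta/\gamma) - 2} \lessgtr 0 \quad \text{as } \beta \gtrless \gamma. \tag{18}$$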
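
As a quick plausibility check on the closed form (17), the sketch below evaluates the income density in two ways: directly from (17), and by applying the transformation rule (3) to the age density (9) and the inverse profile (16). The function names and parameter values are illustrative assumptions, not part of the paper.

```python
import math


def income_density(y, beta, gamma, y_max):
    """Closed form of Eq. (17)."""
    ratio = beta / gamma
    return ratio * y_max ** (-ratio) * (y_max - y) ** (ratio - 1.0)


def income_density_from_ages(y, beta, gamma, y_max):
    """Eq. (3) applied step by step: f(a(y)) * |da/dy|."""
    age = -math.log(1.0 - y / y_max) / gamma          # Eq. (16)
    d_age_d_income = 1.0 / (gamma * (y_max - y))      # derivative of Eq. (16)
    return beta * math.exp(-beta * age) * d_age_d_income


# Illustrative values only: beta < gamma, so the density should rise with income.
for y in (10.0, 50.0, 90.0):
    print(y, income_density(y, 0.04, 0.08, 100.0),
          income_density_from_ages(y, 0.04, 0.08, 100.0))
```

With $\beta = \gamma$ the exponent in (17) vanishes and the density is the constant $1/y_m$, which matches the sign pattern stated in (18).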

3.3. Transition Proportions

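To solve for the transition proportions, first solve (7), using (9), (16),
and the fact that $\exp(\alpha \ln z) = z^{\alpha}$ to obtain

$$r_i^* = (e^{\beta} - 1) \left[ \left( \frac{y_m - k_{i-1}}{y_m - k_i} \right)^{\beta/\gamma} - 1 \right]^{-1}. \tag{19}$$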
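
The intermediate steps behind (19) are not spelled out in the text. One way to fill them in, using only (7), (9), and (16), is sketched below; the layout of the derivation is a reconstruction and not the author's own presentation.

```latex
\begin{align*}
\int_{a(k_i)-1}^{a(k_i)} \beta e^{-\beta a}\,da
  &= e^{-\beta a(k_i)}\,(e^{\beta}-1), \\
\int_{a(k_{i-1})}^{a(k_i)} \beta e^{-\beta a}\,da
  &= e^{-\beta a(k_{i-1})} - e^{-\beta a(k_i)}, \\
e^{-\beta a(y)}
  &= \exp\!\Big(\tfrac{\beta}{\gamma}\ln\big(1 - y/y_m\big)\Big)
   = \Big(\frac{y_m - y}{y_m}\Big)^{\beta/\gamma}, \\
r_i^*
  &= \frac{(e^{\beta}-1)\,(y_m-k_i)^{\beta/\gamma}}
          {(y_m-k_{i-1})^{\beta/\gamma}-(y_m-k_i)^{\beta/\gamma}}
   = (e^{\beta}-1)\left[\left(\frac{y_m-k_{i-1}}{y_m-k_i}\right)^{\beta/\gamma}-1\right]^{-1}.
\end{align*}
```

The first two lines evaluate the numerator and denominator of (7) with the age density (9); the third line substitutes the inverse profile (16); dividing and cancelling the common factor $y_m^{-\beta/\gamma}$ gives the last line.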


By applying the formulas of Section 2, where the basic results were defined,
it is straightforward to derive the full solutions. For convenience, all results
of this section are collected here. Let all individual incomes be determined
from the age-income profile

$$y = y_m (1 - e^{-\gamma a}). \tag{15}$$

Let the labor force have a stable age structure

$$f(a) = \beta e^{-\beta a}, \qquad \beta = \rho + \delta. \tag{9}$$

Then, the income distribution is given by

$$\Phi(y) = (\beta/\gamma)\, y_m^{-\beta/\gamma} (y_m - y)^{(\beta/\gamma) - 1}, \tag{17}$$

and the transition proportions are given by

$$r_{ui} = 1 - e^{-\delta}, \tag{20}$$

$$r_{pi} = e^{-\delta} (e^{\beta} - 1) \left[ \left( \frac{y_m - k_{i-1}}{y_m - k_i} \right)^{\beta/\gamma} - 1 \right]^{-1}, \tag{21}$$

$$r_{bi} = 1 - r_{ui} - r_{pi}. \tag{22}$$
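
The collected results can be turned into a small calculator. The sketch below is a hedged illustration: it assumes the reading of Eqs. (19) to (22) in which upward mobility equals $e^{-\delta} r_i^*$, and the function name, the interval boundaries, and the parameter values are all hypothetical. It also presupposes, as the text does, that every interval spans more than one year of age.

```python
import math


def closed_form_proportions(k_lower, k_upper, beta, gamma, delta, y_max):
    """Return (r_u, r_p, r_b) for the income interval [k_lower, k_upper)."""
    if not 0.0 <= k_lower < k_upper < y_max:
        raise ValueError("need 0 <= k_lower < k_upper < y_max")
    ratio = (y_max - k_lower) / (y_max - k_upper)
    r_star = (math.exp(beta) - 1.0) / (ratio ** (beta / gamma) - 1.0)  # Eq. (19)
    r_u = 1.0 - math.exp(-delta)       # departure from the labor force
    r_p = math.exp(-delta) * r_star    # upward mobility
    r_b = 1.0 - r_u - r_p              # staying in the same interval
    return r_u, r_p, r_b


# Illustrative values only: equal interval lengths and beta equal to gamma.
Y_MAX, BETA, GAMMA, DELTA = 100.0, 0.05, 0.05, 0.02
BOUNDARIES = [0.0, 20.0, 40.0, 60.0]

rows = [
    closed_form_proportions(lo, hi, BETA, GAMMA, DELTA, Y_MAX)
    for lo, hi in zip(BOUNDARIES, BOUNDARIES[1:])
]
for i, (r_u, r_p, r_b) in enumerate(rows, start=1):
    print(i, round(r_u, 4), round(r_p, 4), round(r_b, 4))
```

With equal interval lengths the upward proportion falls from one interval to the next, since higher intervals lie closer to the maximum income and the profile flattens there.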


In this model, the outcome is primarily determined by the parameters $\beta$
and $\gamma$. $\beta$ is the birth rate of the population and, at the same time, the rate
of decay of the age density with increasing age. $\gamma$ is the rate of convergence
of individual income towards its maximum level $y_m$. The shape of the
income distribution is fully determined by $\beta$ and $\gamma$. If $\beta$ equals $\gamma$, the income
density is constant; the effect of the increase in income with age is perfectly
matched by the decrease in age density. If $\beta > \gamma$, the slope of the income
density function with respect to income is everywhere negative; if $\beta < \gamma$, the
slope is everywhere positive. $\beta = \gamma$ may also serve as a benchmark to interpret
the results for upward mobility; $r_{pi}$ then reduces to

$$r_{pi} = e^{-\delta} (e^{\beta} - 1) \big( (y_m - k_i)/q_i \big).$$

Upward mobility appears proportional to the distance of the upper
bound of the income interval to maximum income, measured in terms of
income interval length. If $q_i = q$ for all $i$, $r_{pi}$ will decrease for increasing values
of $i$. This stems from the nature of the age-income profile. Conversely,
if it is desired to keep $r_{pi}$ constant for all $i$, $q_i$ should be proportional to
$y_m - k_i$. If $\beta > \gamma$ (a rate of decay of age density surpassing the rate of
income convergence towards its maximum), there is a depressing effect on
$r_{pi}$ relative to the case where $\beta = \gamma$ and, conversely, if $\beta < \gamma$, there is a
positive effect.

The population parameters $\delta$ and $\beta$ have an identical effect on all $r_{pi}$
(apart from the impact of $\beta/\gamma$). Compared to the case with $\beta = \delta$ (i.e.,
$\rho = \beta - \delta = 0$), $r_{pi}$ is increased if $\beta > \delta$. Hence, in a growing population
with stable age structure, upward mobility is greater than in a stationary
population.


4. Concluding Remarks 

The previous sections have demonstrated how age-income profiles,
income distribution, and the Markov-chain approach to this distribution
may be related to one another. This was accomplished first in a general
way, while later specific formats for the relations were chosen. Age-income
profiles were derived from the development of individual capacity with
age and from a market-determined price per unit of capability. Given the
age-income profile, the distribution of personal income is derived by
assuming a system consisting of individuals at different points in their
career, i.e., at different points along the same age-income profile. The
set of individuals in the system is described by an age density function.
Then, the income distribution results from transforming the age distribution
through the relation between age and income according to the
age-income profile. The relevance of the age distribution for the distribution
of income has been recognized before, of course, but an explicit
analytical framework like the one presented here has not been developed
earlier. And such a framework is required to investigate particular
hypotheses.

In this model it is also possible to derive how individuals move through
the income distribution. The level and rate of change of individual income
is known at every age and since the distribution of individuals by age is 
also known, the frequency of individuals crossing income interval 
boundaries follows. This implies that it is possible to derive the parameters 
of a model similar to the Markov-chain model of income distribution 
(the so-called transition proportions). Strictly speaking, it is no longer 
justified to call this result a Markov-chain (which is based on stochastic 
foundations) since the constant proportions of individuals that move 
between income intervals result from a process in deterministic equilibrium. 
However, interpreting a transition matrix (the array of transition 
proportions) along these lines opens the way to link this matrix to its 
determining forces and allows a theoretical underpinning of the character 
of the matrix. It should be pointed out that there are differences in the 
interpretation of the transition matrix. In the Markov-chain interpretation, 
the matrix is assumed given and implies conclusions about an equilibrium 
income distribution and the time-path towards this distribution. In the 
present interpretation, the transition matrix is a way of describing a process 
in equilibrium; the matrix will remain constant only as long as the underlying
set of profiles does not change. But if this set does change, the
change in the transition matrix is predictable. Even extension of the 
approach to allow nonstationary transition matrices is possible. 

In its present specification, the model is rather simple. In particular, the 
transition matrix exhibits a very restricted structure; those who stay in the 
labor force can either move up one interval or stay where they are. Now it 
is an empirical fact that most year-to-year income changes fall within a 
limited range and similar restrictions have been made elsewhere (see 
[2] or [15]). But the model can easily be expanded to incorporate more 
general specifications. Income decreases can be included by modifying the 
age-income profile to show falling income at higher ages. Derivation of 
the formulas then requires the merger of two analyses, one for ages before
maximum income is reached, and one for the ages beyond. The model 
can also be made more realistic by distinguishing a number of income 
profiles, corresponding to, e.g., homogeneous occupational groups. 

Clearly, many interesting modifications and extensions are possible. 
This paper has only demonstrated the basic framework in a simple 
specification. 


References 

1. G. Böhmer, Lerneffekte als Kosteneinflussgrössen und ihre Berücksichtigung in der
Kostenplanung unter Kostenrechnung, Ph.D. Dissertation, Westfälische Wilhelms-University,
Münster, 1970.

2. D. G. Champernowne, A model of income distribution, Econ. J. 63 (1953), 318-351.

3. E. N. Corlett and V. J. Morcombe, Straightening out learning curves, Personnel
Management 2 (1970), 14-19.

4. P. J. Dhrymes, “Econometrics,” Harper & Row, New York, 1970.

5. R. G. Eckaus, Investment in human capital: a comment, J. Polit. Econ. 71 (1963),
501-505.

6. J. Hartog, Een vergelijking van inkomensmobiliteit naar beroepsgroepen, “Preadvies
van de Vereniging voor de Staathuishoudkunde” (“Comparing income
mobility by occupation”), 1973.

7. J. Hartog, Ability and age-income profiles, Rev. Income and Wealth, in press.

8. J. Hartog, Age-income profiles and capability development, Discussion Paper
7402 G, Institute for Economic Research, Erasmus University Rotterdam, 1974.

9. J. Hartog, Labor demand as demand for capabilities, Discussion Paper 7505 G,
Institute for Economic Research, Erasmus University Rotterdam, 1975.

10. D. King, “Training within the Organization,” Tavistock Publications, 1964.

11. H. Lydall, “The Structure of Earnings,” Oxford Press, 1968.

12. B. Mandelbrot, The Pareto-Lévy law and the distribution of income, Int. Econ.
Rev. (1960), 79-106.

13. B. Mandelbrot, Paretian distribution and income maximization, Quart. J. Econ.
76 (1962), 57-85.

14. J. Mincer, Distribution of labor incomes, J. Econ. Literature 8 (1970), 1-26.

15. G. R. Mustert, The development of the income distribution in the Netherlands
after the second world war, a markovian approach, Research Memorandum,
EIT 47, Tilburg Institute of Economics, 1974.

16. J. D. Pitchford, “Population in Economic Growth,” North-Holland, Amsterdam,
1974.

17. A. D. Roy, Some thoughts on the distribution of earnings, Oxford Economic
Papers, NS, 3 (1951), 135-146.

18. R. S. G. Rutherford, Income distributions: a new model, Econometrica 23 (1955),
277-294.

19. W. H. Somermeyer, Vooroorlogse loonverschillen in de bouwvakken: een poging
tot kwantitatieve verklaring, “Weekblad Bouw,” pp. 381-384, 1947. (“Prewar wage
differentials in construction: an attempt at quantitative explanation.”)

20. J. Tinbergen, On the theory of income distribution, Weltwirtschaftliches Arch. 77
(1956), 155-175.

21. F. Welch, Linear synthesis of skill distributions, J. Human Resources (Summer
1969).

Received: October 17, 1974 


Joop Hartog*
Institute for Economic Research 
Erasmus University Rotterdam 
P.O. Box 1738, Rotterdam, Netherlands 


* I am indebted to Wouter J. Keller and two anonymous referees for valuable suggestions.
An earlier draft of this paper was presented at the European Meeting of the
Econometric Society, held in Grenoble, France, September 1974. 



JOURNAL OF ECONOMIC THEORY 13, 458-471 (1976) 


A Note on the Core of Economies with Atoms or Syndicates 


1. Introduction 

During a lecture given in July 1972 at the University of Marseille
Luminy, Professor Aumann commented upon the results obtained by 
Shitovitz [4]. In an exchange economy with a measure space of economic 
agents, Shitovitz studies the possibilities which the oligopolists, that is 
the atoms of the measure space of economic agents, have to “exploit” the 
other agents. In particular Shitovitz exhibits conditions which, if fulfilled 
by the set of atoms, guarantee that every core allocation is competitive 
from the point of view of the set of agents which are not atoms. Professor 
Aumann formulated two conjectures which extend and generalize these 
results. This paper gives a counterexample to each of these conjectures. 

Section 2 is devoted to the definitions and assumptions which are the 
same as Shitovitz’s [4]. The counterexample to the first conjecture appears
in Section 3. This conjecture concerns the structure of the set of atoms. 
The other conjecture raises the question whether the results proved by 
Shitovitz for atoms remain true if the atoms are replaced by syndicates. 
A counterexample to this conjecture is given in Section 4. Section 5 
analyzes the difference between a syndicate and an atom. 

In order to get his results, Shitovitz used an assumption concerning the 
atoms—the existence of a complete preference relation—which, at first 
sight, looked innocuous if not unnecessary. Indeed it is an important 
assumption for this kind of problem. A syndicate does not have in general 
a complete preference relation. And this is the reason why results which 
hold for atoms are no longer true with syndicates. 


2. Definitions and Notations 

We are interested in an exchange economy $E$ with $\ell$ commodities. Let
$A$ be the set of all the agents of $E$ and $\mathcal{A}$ the $\sigma$-algebra of possible coalitions.
A positive finite measure $\mu$ on the measure space $(A, \mathcal{A})$ represents the
respective “weights” of agents or groups of agents in economy $E$. The
set $A$ can be divided into an atomless part $A_0$ and a countable union of
atoms $A_1$.



For each agent (or consumer) $a \in A$ there are given:

(a) a preference relation $\succ_a$ on $R_+^\ell$, the nonnegative orthant of $R^\ell$;

(b) a vector of initial resources $\omega(a) \in R_+^\ell$. The function $\omega: A \to R_+^\ell$
is assumed to be $\mu$-integrable.

An allocation is a $\mu$-integrable function $\bar{x}: A \to R_+^\ell$ such that

$$\int_A \bar{x}\,d\mu = \int_A \omega\,d\mu.$$

Definition 1. A coalition $S \in \mathcal{A}$ can improve upon the allocation $\bar{x}$
if and only if there exists a $\mu$-integrable function $y: S \to R_+^\ell$ such that

$$\mu(S) > 0, \tag{i}$$

and

$$y(a) \succ_a \bar{x}(a) \quad \text{a.e. in } S, \tag{ii}$$

and

$$\int_S y\,d\mu = \int_S \omega\,d\mu.$$

The set of allocations that cannot be improved upon by any coalition
is the core of the economy $E$.

A price vector is a vector $p \in R_+^\ell$, $p \ne 0$.

Definition 2. An allocation $\bar{x}$ is $p$-efficient if

$$\{x \in R_+^\ell,\ x \succ_a \bar{x}(a)\} \Rightarrow \{p \cdot x > p \cdot \bar{x}(a)\} \quad \text{a.e. in } A.$$

Definitions 1 and 2 imply that, if $\bar{x}$ is a $p$-efficient allocation and if $S \in \mathcal{A}$
is a coalition such that

$$\int_S p \cdot \omega\,d\mu \le \int_S p \cdot \bar{x}\,d\mu,$$

then the coalition $S$ cannot improve upon the allocation $\bar{x}$.
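
The implication just stated follows in three short steps. The argument below is a reconstruction from Definitions 1 and 2, written out only to make the reasoning explicit; $y$ denotes the improving assignment of Definition 1.

```latex
\begin{align*}
&\text{Suppose } S \text{ improves upon } \bar{x} \text{ by means of } y.
  \text{ By (ii) and } p\text{-efficiency, } p \cdot y(a) > p \cdot \bar{x}(a) \text{ a.e. in } S. \\
&\text{Since } \mu(S) > 0: \qquad
  \int_S p \cdot y \, d\mu > \int_S p \cdot \bar{x} \, d\mu . \\
&\text{Since } \int_S y \, d\mu = \int_S \omega \, d\mu : \qquad
  \int_S p \cdot \omega \, d\mu = \int_S p \cdot y \, d\mu > \int_S p \cdot \bar{x} \, d\mu .
\end{align*}
```

The last inequality contradicts the assumed inequality on $S$, so no such $y$ can exist.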

Definition 3. An allocation $\bar{x}$ is competitive if there exists a price
vector $p$ such that

$\bar{x}$ is a $p$-efficient allocation
and

$$p \cdot \bar{x}(a) = p \cdot \omega(a) \quad \text{a.e. in } A.$$





CHAMPSAUR AND LAROQUE 


According to the definitions given by Shitovitz [4], two agents $a_1$ and $a_2$
are said to be of the same type if

$$\omega(a_1) = \omega(a_2),$$

and

$$\forall (x, y) \in R_+^\ell \times R_+^\ell, \quad \{x \succ_{a_1} y\} \Leftrightarrow \{x \succ_{a_2} y\}.$$

Two agents are said to be of the same kind if they are of the same type and
have the same measure.

Two coalitions $S_1$ and $S_2$ are said to be of the same type if there exists
a measurable one-to-one mapping $\theta$ from $S_1$ onto $S_2$ and a real number
$\alpha > 0$ such that

$a$ and $\theta(a)$ are of the same type a.e. in $S_1$,
and

$$\forall T \in \mathcal{A},\ T \subset S_2, \quad \mu(T) = \alpha\,\mu(\theta^{-1}(T)).$$

Two coalitions $S_1$ and $S_2$ are said to be of the same kind if they are of
the same type and if

$$\mu(S_1) = \mu(S_2).$$

A syndicate structure is a set of disjoint coalitions, called syndicates.
A coalition $B \in \mathcal{A}$ is said to be compatible with a syndicate $S$ if either

$$B \cap S = S$$

or

$$B \cap S = \emptyset.$$

Given a syndicate structure, let $\mathcal{B}$ be the $\sigma$-algebra of coalitions which are
compatible with all the syndicates in the structure. Of course

$$\mathcal{B} \subset \mathcal{A}.$$

The set of allocations that cannot be improved upon by any coalitions in
$\mathcal{B}$ is the $\mathcal{B}$-core. Let $A_a$ be the union of all the atoms and syndicates and
$A_e$ be the complement of $A_a$ in $A$. Of course

$$A_e \subset A_0 \quad \text{and} \quad A_1 \subset A_a,$$

and, if there is no syndicate:

$$A_e = A_0 \quad \text{and} \quad A_1 = A_a.$$

Note that the syndicates considered by Dreze and Gabszewicz [3] are 
composed of identical agents and that these authors limit themselves to 
the study of ^-measurable allocations (equal treatment within a syndicate). 
In a sense, therefore, these syndicates could be considered as atoms. 

The following assumptions will be used: 

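H1.

\[ \int_A \omega\, d\mu > 0. \]

H2 (monotonicity of the preferences).

\[ \forall (x, y) \in \mathbb{R}_+^{\ell} \times \mathbb{R}_+^{\ell}, \quad \{x \ge y \text{ and } x \ne y\} \Rightarrow \{x \succ_a y\}, \quad \text{a.e. in } A. \]

H3 (continuity of the preferences).

$\forall y \in \mathbb{R}_+^{\ell}$, the sets

\[ \{x \in \mathbb{R}_+^{\ell} \mid x \succ_a y\} \quad \text{and} \quad \{x \in \mathbb{R}_+^{\ell} \mid y \succ_a x\} \]

are open in $\mathbb{R}_+^{\ell}$.

H4 (measurability of the preferences).

If $x$ and $y$ are $\mu$-integrable functions from A into $\mathbb{R}_+^{\ell}$, then the set

\[ \{a \in A \mid x(a) \succ_a y(a)\} \]

is measurable (i.e., it belongs to $\mathcal{A}$).

H5 (convexity assumptions on the atoms).

For each atom $a \in A_1$, the preference relation $\succ_a$ is convex; i.e., $\forall y \in \mathbb{R}_+^{\ell}$, the set $\{x \in \mathbb{R}_+^{\ell} \mid x \succ_a y\}$ is convex.

H6. For each atom $a \in A_1$, the preference relation $\succ_a$ is derived from a “preference or indifference” relation $\succsim_a$ on $\mathbb{R}_+^{\ell}$, which is assumed to be a complete preorder.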

With the help of the preceding assumptions, Shitovitz [4] proved the 
following results. 





Theorem A. If $\bar{x}$ is an allocation belonging to the $\mathcal{B}$-core, there exists a price vector p such that:

(i) $\bar{x}$ is a p-efficient allocation, and

(ii) $p \cdot \omega(a) \ge p \cdot \bar{x}(a)$, a.e. in $\tilde{A}_0$.

A proof of Theorem A with an argument completely different from the one used by Shitovitz can be found in [2].

In terms of value of their consumption bundle, the agents who are neither atoms nor members of a syndicate lose, or at least do not gain, at every $\mathcal{B}$-core allocation that is not competitive. That does not mean that they do not gain utility, even in monopolistic economies with one atom only. For a discussion of this problem see Aumann [1] and Shitovitz [4].

Theorem B. If there exist at least two atoms and if all the atoms are 
of the same type, then the core is equal to the set of competitive allocations. 

Theorem C. If the atomic part $A_1$ can be divided into at least two disjoint coalitions of the same kind, then for each allocation $\bar{x}$ belonging to the core, there exists a price vector p such that:

(i) $\bar{x}$ is a p-efficient allocation;

(ii) $p \cdot \omega(a) = p \cdot \bar{x}(a)$, a.e. in $A_0$;

(iii) if $a_1$ and $a_2$ are two atoms of the same kind, then $p \cdot \bar{x}(a_1) = p \cdot \bar{x}(a_2)$.

Note that under the conditions of Theorem C, any core allocation is 
competitive from the point of view of the members of the nonatomic part 
in the sense that the value of their consumption at efficiency prices p is 
equal to the value of their initial resources (in the terminology of Dreze
and Gabszewicz [3], this is a “competitive restricted allocation”). That 
does not mean that the utility levels of the members of the nonatomic 
part are the same as the utility levels corresponding to a competitive 
allocation. In other words, the price vector p is not necessarily the price 
system of a competitive equilibrium. 

Conjecture 1. If the atomic part $A_1$ can be divided into at least two disjoint coalitions of the same type, then for each allocation $\bar{x}$ belonging to the core there exists a price vector p such that:

(i) $\bar{x}$ is a p-efficient allocation,

(ii) $p \cdot \omega(a) = p \cdot \bar{x}(a)$ a.e. in $A_0$.





This conjecture, which generalizes Theorems B and C, is true in the case where the preferences of all the atoms can be represented by the same homogeneous utility function [5, Theorem 2].

A counterexample to this conjecture is given in Section 3. 

Conjecture 2. If there are at least two syndicates, if they all are of the same kind and if all the atoms belong to the syndicates, then for each allocation $\bar{x}$ belonging to the $\mathcal{B}$-core there exists a price vector p such that:

(i) $\bar{x}$ is a p-efficient allocation,

(ii) $p \cdot \omega(a) = p \cdot \bar{x}(a)$, a.e. in $\tilde{A}_0$.

A counterexample to this conjecture is given in Section 4. 


3. A Counterexample to Conjecture 1 

We consider an exchange economy E with three goods; $\ell = 3$. There are two atoms $b_1$ and $b_2$ of the same type with preferences represented by the utility function

\[ u^b(x) = \min\{(x_2 + 2x_3/N),\ (2x_3 + x_2/N)\} + (x_1/N), \]

with initial resources

\[ \omega(b_1) = \omega(b_2) = (3, 0, 0), \]

and with measures

\[ \mu(b_1) = 2/3, \quad \mu(b_2) = 1/3, \]

where $x_h$ is the consumption of good h and N a finite integer greater than 4. The counterexample works with $N = +\infty$. Finite values of N are considered in order to get the monotonicity assumption H2.

There are two other atoms $c_1$ and $c_2$ of the same type with utility function

\[ u^c(x) = \min\{(x_1 + 2x_3/N),\ (2x_3 + x_1/N)\} + (x_2/N), \]

initial resources

\[ \omega(c_1) = \omega(c_2) = (0, 3, 0), \]

and measures

\[ \mu(c_1) = 2/3, \quad \mu(c_2) = 1/3. \]





The nonatomic part $A_0$ is composed of agents a, all of whom have utility function

\[ u^a(x) = (x_1 + 1/N)(x_2 + 1/N)(x_3 + 1/N), \]

and initial resources

\[ \omega(a) = (0, 0, 3). \]

Moreover, we take

\[ \mu(A_0) = 1. \]

It is easy to verify that all the assumptions H1 to H6 are fulfilled by the economy E.

The atomic part $A_1$ can be divided into two coalitions of the same type, $b_1 \cup c_1$ and $b_2 \cup c_2$.

Consider the following allocation $\bar{x}_\varepsilon$:

\[ \bar{x}_\varepsilon(b_1) = (0,\ 2 + 2\varepsilon,\ 1 + \varepsilon), \]
\[ \bar{x}_\varepsilon(b_2) = (0,\ 2 - \varepsilon,\ 1 - \varepsilon/2), \]
\[ \bar{x}_\varepsilon(c_1) = (2,\ 0,\ 1), \]
\[ \bar{x}_\varepsilon(c_2) = (2 + 3\varepsilon,\ 0,\ 1 + 3\varepsilon/2), \]
\[ \forall a \in A_0, \quad \bar{x}_\varepsilon(a) = (1 - \varepsilon,\ 1 - \varepsilon,\ 1 - \varepsilon), \]

where $\varepsilon$ is a real number with $0 < \varepsilon < 1/4$. Let p be the price vector defined by

\[ p = (1, 1, 1). \]

It is easy to verify that $\bar{x}_\varepsilon$ is p-efficient. Since the utility function of every agent $a \in A_0$ is differentiable at point $\bar{x}_\varepsilon(a)$, there exists no price vector $p'$ nonproportional to p such that $\bar{x}_\varepsilon$ is $p'$-efficient. We have

\[ \forall a \in A_0, \quad p \cdot \bar{x}_\varepsilon(a) - p \cdot \omega(a) = -3\varepsilon < 0. \]

Therefore in order to show a contradiction with Conjecture 1 we have to prove that there exist values for N and $\varepsilon$ such that the allocation $\bar{x}_\varepsilon$ belongs to the core of the economy E.
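Before turning to the core argument, the bookkeeping of the example can be checked mechanically. The sketch below is an illustration only: the function names are hypothetical, and the measures, initial resources, and bundles are the ones read off the example (atoms b_1, b_2, c_1, c_2 and the nonatomic part A_0). It confirms with exact rational arithmetic that the allocation exhausts the total resources (3, 3, 3) and that each agent of A_0 receives a bundle worth 3ε less than his initial resources at p = (1, 1, 1).

```python
from fractions import Fraction as F


def groups(eps):
    """Map each group to (measure, initial resources, bundle under the allocation)."""
    return {
        "b1": (F(2, 3), (3, 0, 0), (0, 2 + 2 * eps, 1 + eps)),
        "b2": (F(1, 3), (3, 0, 0), (0, 2 - eps, 1 - eps / 2)),
        "c1": (F(2, 3), (0, 3, 0), (2, 0, 1)),
        "c2": (F(1, 3), (0, 3, 0), (2 + 3 * eps, 0, 1 + 3 * eps / 2)),
        "A0": (F(1), (0, 0, 3), (1 - eps, 1 - eps, 1 - eps)),
    }


def totals(eps):
    """Aggregate initial resources and aggregate consumption, good by good."""
    data = list(groups(eps).values())
    resources = tuple(sum(m * w[h] for m, w, _ in data) for h in range(3))
    consumption = tuple(sum(m * x[h] for m, _, x in data) for h in range(3))
    return resources, consumption


def value_gap(eps, name):
    """p.x - p.w for one agent of the named group, at p = (1, 1, 1)."""
    _, w, x = groups(eps)[name]
    return sum(x) - sum(w)


if __name__ == "__main__":
    eps = F(1, 10)  # any rational value in ]0, 1/4[ can be used here
    print(totals(eps))
    print(value_gap(eps, "A0"))
```

Using `Fraction` keeps the check exact, so the equality of resources and consumption is not blurred by rounding.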

For large values of N, it is clear that any coalition that does not have in its initial resources positive quantities of all goods cannot improve upon the allocation $\bar{x}_\varepsilon$. Therefore any coalition that can improve upon $\bar{x}_\varepsilon$ contains necessarily at least two atoms of different types and part of $A_0$. Furthermore if a coalition S can improve upon $\bar{x}_\varepsilon$, we have necessarily:

\[ \int_S p \cdot \bar{x}_\varepsilon\, d\mu < \int_S p \cdot \omega\, d\mu. \]

Therefore any such coalition must have one among the five following compositions:

\[ b_1 \cup c_1 \cup \alpha A_0, \]
\[ b_2 \cup c_2 \cup \alpha A_0, \]
\[ b_2 \cup c_1 \cup \alpha A_0, \]
\[ b_1 \cup b_2 \cup c_1 \cup \alpha A_0, \]
\[ b_2 \cup c_1 \cup c_2 \cup \alpha A_0, \]

where $\alpha A_0$ is a coalition of measure $\alpha$, $0 < \alpha \le 1$, with $\alpha A_0 \subset A_0$.
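The list of five compositions can be recovered by enumeration. The sketch below is illustrative and uses hypothetical names. It assumes the value surpluses μ(group)·(p·x − p·ω)/ε implied by the example at p = (1, 1, 1): 2 for b_1, −1/2 for b_2, 0 for c_1, and 3/2 for c_2, while a part αA_0 contributes −3α. A composition is kept when it contains at least one atom of each type and the necessary value inequality can hold for some α in ]0, 1].

```python
from fractions import Fraction as F
from itertools import combinations

# mu(group) * (p.x - p.w) / eps at p = (1, 1, 1), taken from the example.
SURPLUS = {"b1": F(2), "b2": F(-1, 2), "c1": F(0), "c2": F(3, 2)}


def admissible_compositions():
    """Return {atoms: lower bound on alpha} for compositions passing the value test."""
    found = {}
    atoms = sorted(SURPLUS)
    for size in range(1, len(atoms) + 1):
        for combo in combinations(atoms, size):
            has_b = any(name.startswith("b") for name in combo)
            has_c = any(name.startswith("c") for name in combo)
            if not (has_b and has_c):
                continue  # at least two atoms of different types are needed
            s = sum(SURPLUS[name] for name in combo)
            # Value test: s - 3 * alpha < 0 for some alpha in ]0, 1], i.e. s < 3.
            if s < 3:
                found[combo] = max(s / 3, F(0))
    return found


if __name__ == "__main__":
    for combo, bound in admissible_compositions().items():
        print(" u ".join(combo), "with alpha >", bound)
```

The bound attached to b_2 ∪ c_2 is 1/3, which is the restriction on α used for the composition treated in detail.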

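Consider for example a coalition S with composition

\[ b_2 \cup c_2 \cup \alpha A_0. \]

Its initial resources are equal to

\[ \int_S \omega\, d\mu = (1,\ 1,\ 3\alpha). \]

If S can improve upon $\bar{x}_\varepsilon$, we have necessarily:

\[ \int_S p \cdot \bar{x}_\varepsilon\, d\mu < \int_S p \cdot \omega\, d\mu, \]

which implies that $\alpha > 1/3$.

By definition, in order to improve upon $\bar{x}_\varepsilon$, S must be able with its initial resources to obtain utility levels equal at least to

\[ u^b(\bar{x}_\varepsilon(b_2)) = (1 + 1/N)(2 - \varepsilon) \quad \text{for atom } b_2, \]
\[ u^c(\bar{x}_\varepsilon(c_2)) = (1 + 1/N)(2 + 3\varepsilon) \quad \text{for atom } c_2, \]
\[ u^a(\bar{x}_\varepsilon(a)) = (1 - \varepsilon + 1/N)^3 \quad \text{for } a \in \alpha A_0. \]

Consider the set Y of $\mu$-integrable functions $y : A \to \mathbb{R}_+^{\ell}$ such that:

\[ \int_S y\, d\mu = \int_S \omega\, d\mu, \]

\[ u^b(y(b_2)) \ge u^b(\bar{x}_\varepsilon(b_2)) = (1 + (1/N))(2 - \varepsilon), \]

and

\[ u^c(y(c_2)) \ge u^c(\bar{x}_\varepsilon(c_2)) = (1 + (1/N))(2 + 3\varepsilon). \]

With every $y \in Y$ we associate the utility $u^a(y)$ of the average consumption of the members of coalition $\alpha A_0$:

\[ u^a(y) = u^a\!\left( \frac{1}{\mu(\alpha A_0)} \int_{\alpha A_0} y\, d\mu \right). \]

If $y \in Y$ maximizes $u^a(y)$ on the set Y then, for large values of N:

\[ y(b_2) = \bar{x}_\varepsilon(b_2) = (0,\ 2 - \varepsilon,\ 1 - \varepsilon/2), \]
\[ y(c_2) = \bar{x}_\varepsilon(c_2) = (2 + 3\varepsilon,\ 0,\ 1 + 3\varepsilon/2), \]

\[ \frac{1}{\mu(\alpha A_0)} \int_{\alpha A_0} y\, d\mu = \frac{1}{\alpha} \left[ \int_S \omega\, d\mu - \int_{b_2 \cup c_2} y\, d\mu \right] = \frac{1}{\alpha} \left[ 1 - \tfrac{1}{3}(2 + 3\varepsilon),\ 1 - \tfrac{1}{3}(2 - \varepsilon),\ 3\alpha - \tfrac{1}{3}(2 + \varepsilon) \right]. \]

Therefore:

\[ u^a(y) = \left( \frac{1 - 3\varepsilon}{3\alpha} + \frac{1}{N} \right) \left( \frac{1 + \varepsilon}{3\alpha} + \frac{1}{N} \right) \left( \frac{9\alpha - 2 - \varepsilon}{3\alpha} + \frac{1}{N} \right). \]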


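The maximum of $u^a(y)$ when $\alpha$ belongs to the interval $]1/3, 1]$ obtains for:

\[ \alpha = (1/3) + (\varepsilon/6) + O_1(1/N), \]

where

\[ \lim_{N \to \infty} O_1(1/N) = 0. \]

Therefore:

\[ \forall \alpha \in\, ]1/3, 1] \text{ and } \forall y \in Y, \quad \frac{u^a(y)}{u^a(\bar{x}_\varepsilon(a))} \le \frac{(1 - 3\varepsilon)(1 + \varepsilon)}{(1 + (\varepsilon/2))^2 (1 - \varepsilon)^3} + O_2(1/N), \]

where

\[ \lim_{N \to \infty} O_2(1/N) = 0. \]

For $\varepsilon \in\, ]0, 1/4[$,

\[ \frac{(1 - 3\varepsilon)(1 + \varepsilon)}{(1 + (\varepsilon/2))^2 (1 - \varepsilon)^3} < 1. \]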





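Hence, there exists $\varepsilon_0 > 0$ ($\varepsilon_0 < 1/4$) such that:

$\forall \varepsilon \in\, ]0, \varepsilon_0[$,

$\exists N_0(\varepsilon)$ such that:

if $N > N_0(\varepsilon)$ the allocation $\bar{x}_\varepsilon$ cannot be improved upon by a coalition of composition $b_2 \cup c_2 \cup \alpha A_0$.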

A similar property can be proved for each of the four other possible 
compositions. Therefore the desired result follows. 
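The limiting case N → ∞ of the computation for the composition b_2 ∪ c_2 ∪ αA_0 can be checked numerically. The sketch below is an illustration with hypothetical function names. It assumes the limiting expression (1 − 3ε)(1 + ε)(9α − 2 − ε)/(27α³) for the utility of the average bundle left to αA_0, and compares a brute-force maximum over α with the closed-form maximizer 1/3 + ε/6 and with the bound on the utility ratio.

```python
def limit_utility(alpha, eps):
    """N -> infinity limit of u^a(y) for the coalition b2 u c2 u alpha*A0."""
    return (1 - 3 * eps) * (1 + eps) * (9 * alpha - 2 - eps) / (27 * alpha ** 3)


def ratio_bound(eps):
    """Limiting upper bound on u^a(y) / u^a(x_eps(a))."""
    return (1 - 3 * eps) * (1 + eps) / ((1 + eps / 2) ** 2 * (1 - eps) ** 3)


def grid_max(eps, steps=20000):
    """Brute-force maximum of limit_utility over a grid of alpha in ]1/3, 1]."""
    best_alpha, best_value = None, float("-inf")
    for k in range(1, steps + 1):
        alpha = 1 / 3 + (2 / 3) * k / steps
        value = limit_utility(alpha, eps)
        if value > best_value:
            best_alpha, best_value = alpha, value
    return best_alpha, best_value


if __name__ == "__main__":
    for eps in (0.01, 0.1, 0.2, 0.24):
        alpha, value = grid_max(eps)
        print(eps, alpha, 1 / 3 + eps / 6, value / (1 - eps) ** 3, ratio_bound(eps))
```

The grid search is deliberately crude: it is meant to cross-check the closed-form maximizer, not to replace the argument for finite N.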

4. A Counterexample to Conjecture 2 

We consider an economy E with three goods. The set A of agents can 
be divided into three disjoint subsets 

A = B u C u D, 

with 

\[ \mu(B) = \mu(C) = \mu(D) = 1. \]

The agents have the following characteristics:

\[ \forall a \in B \cup C, \quad \omega(a) = (3/2,\ 3/2,\ 0), \]
\[ \forall a \in B, \quad u^a(x) = x_2 + x_3 + x_1/N, \]
\[ \forall a \in C, \quad u^a(x) = x_1 + x_3 + x_2/N, \]
\[ \forall a \in D, \quad \omega(a) = (0,\ 0,\ 3), \]
\[ \forall a \in D, \quad u^a(x) = (x_1 + 1/N)(x_2 + 1/N)(x_3 + 1/N), \]

where N is an integer greater than 4. Assumptions H1 to H6 are fulfilled.

The subset B (resp. C) is divided into two disjoint atoms $B_1$ and $B_2$ (resp. $C_1$ and $C_2$) of the same kind. Coalition D is atomless. There are two syndicates $S_1$ and $S_2$ of the same kind with the following compositions:

\[ S_1 = B_1 \cup C_1, \]
\[ S_2 = B_2 \cup C_2. \]

The set $\tilde{A}_0$ of agents which are not atoms and do not belong to some syndicate is equal to coalition D.




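Consider the following allocation $\bar{x}_\varepsilon$:

\[ \forall a \in B_1, \quad \bar{x}_\varepsilon(a) = (0,\ 4 + 2\varepsilon,\ 2 + \varepsilon), \]
\[ \forall a \in B_2, \quad \bar{x}_\varepsilon(a) = (0,\ 0,\ 0), \]
\[ \forall a \in C_1, \quad \bar{x}_\varepsilon(a) = (0,\ 0,\ 0), \]
\[ \forall a \in C_2, \quad \bar{x}_\varepsilon(a) = (4 + 2\varepsilon,\ 0,\ 2 + \varepsilon), \]
\[ \forall a \in D, \quad \bar{x}_\varepsilon(a) = (1 - \varepsilon,\ 1 - \varepsilon,\ 1 - \varepsilon), \]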

where e is a small positive real number, say 0 < e < Let p be the price 
vector defined by: 

\[ p = (1, 1, 1). \]

The allocation $\bar{x}_\varepsilon$ is p-efficient and there exists no other price vector $p'$ nonproportional to p such that $\bar{x}_\varepsilon$ is $p'$-efficient.

We have to show the existence of values for $\varepsilon$ and N such that the allocation $\bar{x}_\varepsilon$ cannot be improved upon by any coalition that is compatible with the syndicates $S_1$ and $S_2$. It is clear that a coalition T compatible with the existing syndicates must, in order to be able to improve upon $\bar{x}_\varepsilon$, contain only one syndicate. Because of the symmetry of $\bar{x}_\varepsilon$, we have to show that any coalition T of the following composition

\[ T = S_1 \cup \alpha D \]

cannot improve upon $\bar{x}_\varepsilon$ (where $\alpha D$ is a coalition of measure $\alpha$, $\alpha \le 1$, with $\alpha D \subset D$). The initial resources of T are equal to

\[ \int_T \omega\, d\mu = (3/2,\ 3/2,\ 3\alpha). \]

If T can improve upon $\bar{x}_\varepsilon$, we have $\int_T p \cdot \bar{x}_\varepsilon\, d\mu < \int_T p \cdot \omega\, d\mu$, which implies $\alpha > 1/2$. By definition, in order to improve upon $\bar{x}_\varepsilon$, T must be able, with its own initial resources, to obtain utility levels equal at least to

$6 + 3\varepsilon$ for the agents $a \in B_1$,

and

$\bar{u} = (1 - \varepsilon + (1/N))^3$ for the agents $a \in \alpha D$.

If N is large enough (for example $N > 6/\varepsilon$), such a utility level implies that the agents of $B_1$ receive quantities of goods 2 and 3 with a sum at least equal to $3 + \varepsilon$. Therefore the average consumption of the members of $\alpha D$ is at best equal to

\[ \left( \frac{3}{2\alpha},\ \frac{(3/2) + 3\alpha - 3 - \varepsilon}{2\alpha},\ \frac{(3/2) + 3\alpha - 3 - \varepsilon}{2\alpha} \right). \]

For $\alpha \in\, ]1/2, 1]$, the utility level u corresponding to this consumption is at most equal to

\[ u \le (27/32)(1 - 2\varepsilon/3)^2 + O_3(1/N), \]

where

\[ \lim_{N \to \infty} O_3(1/N) = 0. \]

Therefore it is clear that, for small enough values of e (it can be verified 
that e < ^ will do) and for large enough values of N, we have 

\[ u < \bar{u}. \]

The coalition T cannot improve upon the allocation $\bar{x}_\varepsilon$ in such a case.
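The comparison of u with the level (1 − ε)³ can be illustrated numerically in the limiting case N → ∞. The sketch below uses hypothetical function names and assumes the average bundle (3/(2α), r, r) with r = ((3/2) + 3α − 3 − ε)/(2α) for the members of αD. It checks on a grid that the utility of that bundle is largest at α = 1, where it equals (27/32)(1 − 2ε/3)², and that this value stays below (1 − ε)³ at the sample value ε = 0.05, which is an arbitrary choice made for illustration.

```python
def limit_utility(alpha, eps):
    """N -> infinity limit of the utility of the average bundle left to alpha*D."""
    leftover = (1.5 + 3 * alpha - 3 - eps) / (2 * alpha)  # goods 2 and 3, per head
    if leftover < 0:
        return None  # B_1 cannot be served out of T's resources
    return (1.5 / alpha) * leftover * leftover


def bound(eps):
    """Value of limit_utility at alpha = 1."""
    return (27 / 32) * (1 - 2 * eps / 3) ** 2


def target(eps):
    """Limiting utility of an agent of D under the allocation."""
    return (1 - eps) ** 3


def grid_max(eps, steps=5000):
    """Largest feasible value of limit_utility on a grid of alpha in ]1/2, 1]."""
    values = []
    for k in range(1, steps + 1):
        value = limit_utility(0.5 + 0.5 * k / steps, eps)
        if value is not None:
            values.append(value)
    return max(values)


if __name__ == "__main__":
    eps = 0.05  # sample value, chosen for illustration only
    print(grid_max(eps), bound(eps), target(eps))
```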


5. Analogies and Differences between Atoms and Syndicates 

Why do results like Theorems B and C not hold if the atoms are replaced 
by syndicates? What assumption, which must be fulfilled by each atom, 
is not satisfied by the syndicates considered in the above counterexample ? 

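Indeed our definition of a syndicate S and of the $\mathcal{B}$-core implies that a syndicate S is endowed with a preference relation $\succ_S$ on the average consumption of its members which is defined as follows:

\[ \forall (x, x') \in \mathbb{R}_+^{\ell} \times \mathbb{R}_+^{\ell}, \quad x \succ_S x' \]

iff for all $\bar{x}' : S \to \mathbb{R}_+^{\ell}$ $\mu$-integrable such that

\[ \int_S \bar{x}'\, d\mu = x', \]

there exists $\bar{x} : S \to \mathbb{R}_+^{\ell}$ $\mu$-integrable such that

\[ \int_S \bar{x}\, d\mu = x \]

and

\[ \bar{x}(a) \succ_a \bar{x}'(a) \quad \text{a.e. in } S. \]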





It is easy to verify that assumptions H2 to H5 are satisfied by $\succ_{S_1}$ and $\succ_{S_2}$ in our counterexample. Also $\succ_{S_1}$ and $\succ_{S_2}$ derive in an obvious manner from preference or indifference relations $\succsim_{S_1}$ and $\succsim_{S_2}$ on $\mathbb{R}_+^{\ell}$ which are preorders (reflexive and transitive binary relations) but which are not complete. Therefore among the properties of an atom (assumptions H2 to H6) the only one which a syndicate fails to have is completeness of its preference or indifference relations.

Note that if all the members of a syndicate S have the same preferences, which can be represented by a homogeneous utility function, then, as is well known, the preference relation $\succ_S$ can also be represented by the above homogeneous utility function and, therefore, is a complete preorder. Indeed there exists another way to get results similar to Theorems B and C with syndicates provided we restrict the set of considered allocations. It was already the approach followed by Dreze and Gabszewicz [3] since they considered only $\mathcal{B}$-measurable allocations.

Definition 5. An allocation $\bar{x}$ is nondiscriminatory with respect to the given syndicate structure if, for all pairs of syndicates $S_1$ and $S_2$ of the same type, there exists a measurable one-to-one mapping $\varphi$ from $S_1$ onto $S_2$ such that one of the following coalitions has a measure equal to zero:

\[ \{a \in S_1 \mid \bar{x}(a) \succ_a \bar{x}(\varphi(a))\}, \]
\[ \{a \in S_1 \mid \bar{x}(\varphi(a)) \succ_a \bar{x}(a)\}, \]

and where the mapping $\varphi$ is such that:

$\mu$-a.e. in $S_1$, $a$ and $\varphi(a)$ are of the same type

and

\[ \exists \alpha \in \mathbb{R},\ \alpha > 0 \ \text{with} \ \forall T \in \mathcal{A},\ T \subset S_2, \quad \mu(\varphi^{-1}(T)) = \alpha\, \mu(T). \]

Note that a $\mathcal{B}$-measurable allocation is evidently nondiscriminatory and that the allocation $\bar{x}_\varepsilon$ in the counterexample to Conjecture 2 is discriminatory.

Using either a proof similar to Shitovitz’s or the same approach as in 
[2] it is possible to prove the following generalization of Theorem B. 

Theorem B'. Under assumptions H1 to H6 (assumption H6 being fulfilled not only by the atoms but also by all the members of the syndicates), if there are at least two syndicates, if they all are of the same type and if all the atoms belong to the syndicates ($\tilde{A}_0 \subset A_0$), then for each allocation $\bar{x}$ belonging to the $\mathcal{B}$-core that is nondiscriminatory with respect to the existing syndicate structure, there exists a price vector p such that:

(i) $\bar{x}$ is a p-efficient allocation,

(ii) $p \cdot \omega(a) = p \cdot \bar{x}(a)$ a.e. in $\tilde{A}_0$.

Theorem C could also be generalized in the same way. 


References 

1. R. J. Aumann, Disadvantageous monopolies, J. Econ. Theory 6 (1973), 1-11. 

2. P. Champsaur and G. Laroque, The equivalence between the core and the set of 
competitive equilibria: a new approach, I.N.S.E.E., Paris, 1973. 

3. J. Drèze and J. Jaskold-Gabszewicz, Syndicates of traders in an exchange economy, in “Differential Games and Related Topics,” (H. W. Kuhn and G. P. Szegö, Eds.), pp. 399-414. North-Holland, Amsterdam, 1971.

4. B. Shitovitz, On some problems arising in markets with some large traders and 
a continuum of small traders, J. Econ. Theory 8 (1974), 458-470. 

Received: January 24, 1975; revised: November 21, 1975 

Paul Champsaur and Guy Laroque* 
Institut National de la Statistique 
et des Etudes Economiques, Paris 


* The authors gratefully acknowledge the helpful suggestions and comments of 
Professors Robert Aumann and Zvi Artstein. 



JOURNAL OF ECONOMIC THEORY 13, 472-477 (1976) 


The Welfare Implications of 
Becker’s Discrimination Coefficient* 


This note examines critically two welfare theorems proved by Gary Becker in 
his "Economics of Discrimination." Becker demonstrated that discrimination 
diminishes the welfare of a discriminating group. He also derived a necessary 
condition for a group which is the victim of discrimination to be made relatively 
worse off. These results are shown to depend critically on the assumption that 
discriminators incur psychic costs which nondiscriminators do not. When the
same market equilibrium results from psychic benefits, “discriminators” are 
shown to gain from having a taste for favoritism and the group discriminated 
against is shown always to be made worse off. 


In his seminal work on the economics of discrimination, Gary Becker 
derived the result that discrimination diminishes the welfare of both the 
discriminating group and the group discriminated against. He also 
demonstrated that a group which is both a numeric and an economic 
minority is made relatively worse off when it is the victim of discrimination.^1 In this note I show that these results depend on a definition of
discrimination that has those who prefer nonpecuniary to pecuniary reward 
worse off at all combinations of prices. In effect the argument is circular 
and discriminators are assumed to be worse off because they have a taste 
for discrimination rather than a taste for favoritism. 

Becker's market discrimination coefficient for the case of perfect substitutes in a competitive market is defined as the percentage equilibrium price differential.^2 In terms of Becker’s two sector trade model, W capital ($K_W$) discriminates against N labor ($L_N$) by exporting less capital $E_K$ to the N sector than is required to equalize the marginal products of capital in each sector. Where W capital and N capital are perfect substitutes, each receives its marginal product, and W capital has a taste for discrimination against N labor, the market discrimination coefficient is


* An earlier draft of this paper was presented before the Workshop on the Economic Behavior of Households, The University of Wisconsin. Comments by Glen G. Cain and Martin C. David and other members of the Workshop are gratefully acknowledged. The author also thanks an anonymous referee for his comments. Any remaining errors are, of course, the author’s sole responsibility.

1 [1, pp. 32-38].
2 [1, p. 17].



\[ d = \frac{f_K^N - f_K^W}{f_K^W}, \qquad d > 0. \tag{1} \]

The resulting relation between the marginal products of capital in each sector is

\[ f_K^N = f_K^W (1 + d), \tag{1'} \]

where the production functions for the two sectors are

\[ f^W(L_W, \hat{K}_W) = \text{output in sector } W, \qquad \hat{K}_W = K_W - E_K, \]
\[ f^N(L_N, \hat{K}_N) = \text{output in sector } N, \qquad \hat{K}_N = K_N + E_K, \]
\[ f_L^i,\ f_K^i > 0, \qquad f_{LK}^i,\ f_{KL}^i > 0, \qquad f_{LL}^i,\ f_{KK}^i < 0 \qquad \text{for } i = W, N. \]

The stock of capital in sector N is augmented by the capital exports from sector W, while the capital stock in sector W is diminished by the loss of the exported capital. Thus, the available capital in sector W is $\hat{K}_W = K_W - E_K$ and the available capital in sector N is $\hat{K}_N = K_N + E_K$.

As implied by Eq. (1'), W capital requires a higher price in sector N than in sector W to compensate for the nonpecuniary costs associated with hiring N labor. In deriving his welfare propositions, Becker assumes that the quantity $(f_K^W d)$ represents the absolute nonpecuniary cost borne per unit of W capital when it works with N labor. This assumption implies that when factors receive their marginal products, “net income” (adjusted for nonpecuniary cost) received by W is

\[ Y(W) = f_L^W L_W + f_K^W (K_W - E_K) + (f_K^N - f_K^W d) E_K. \tag{2} \]

Substituting (1) we get

\[ Y(W) = f_L^W L_W + f_K^W K_W. \tag{3} \]
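The substitution can be written out in two steps. This is a worked restatement using only the equilibrium condition (1') and the definition (2), added here for convenience.

```latex
\begin{align*}
f_K^N - f_K^W d &= f_K^W (1 + d) - f_K^W d = f_K^W, \\
Y(W) &= f_L^W L_W + f_K^W (K_W - E_K) + f_K^W E_K = f_L^W L_W + f_K^W K_W .
\end{align*}
```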

Defined in this way, “net income” of W does not include all of the earnings from capital exports. The W capital exported to N is valued at $f_K^W$ per unit rather than $f_K^N$. The intersector differential is excluded on the grounds that the higher pecuniary return in the N sector is offset by nonpecuniary costs. Analogously, net income for N (which does not discriminate) is defined as

\[ Y(N) = f_L^N L_N + f_K^N K_N. \tag{4} \]

Since N does not discriminate, “net income” is real income and no adjustment for nonpecuniary benefits is required.

The total effect of discrimination on net income of W or N can be expressed by the total derivative of net income Y(i) (i = N, W) with respect to the discrimination coefficient. With total stocks of labor and capital fixed, the only effects of changes in discrimination occur through changes in the quantity of capital exported by W. Thus, the total derivative is

\[ \frac{dY(i)}{d(d)} = \frac{\partial Y(i)}{\partial E_K}\, \frac{dE_K}{d(d)}. \tag{5} \]


From differentiation of (1) or (1'), one can show that when the equilibrium condition holds,

\[ \frac{dE_K}{d(d)} = \frac{f_K^W}{f_{KK}^N + (1 + d) f_{KK}^W} < 0, \tag{6} \]

since $f_{KK}^N, f_{KK}^W < 0$, $f_K^W > 0$, $d > 0$. Thus, discrimination reduces capital exports. It follows that



(7) 


For $f^W$ and $f^N$ homogeneous of degree 1 Becker shows that^3

\[ \frac{\partial Y(W)}{\partial E_K} > 0, \tag{8} \]

\[ \frac{\partial Y(N)}{\partial E_K} > 0. \tag{9} \]


These results lie at the heart of his assertion that discrimination diminishes 
the welfare of both the discriminating group and the group discriminated 
against. Since discrimination leads to less capital exported, (8) and (9) 
imply that discrimination diminishes the net incomes of W and N. 

His proof of the effect of discrimination on the relative welfare of the groups follows similar lines. If we define

\[ R = \frac{Y(N)}{Y(W)}, \tag{10} \]

one can show that^4


8R 

dE K 


5 0 


as 


J VO ^Lw 

nw )• 


no 


This result serves as the proof of Becker’s assertion that a group which is both an economic minority (e.g., $Y(N)/Y(W) < 1$) and a numeric minority (e.g., $L_W/L_N > 1$) is made relatively worse off by being discriminated against.

3 [1, pp. 33-34].
4 [1, pp. 35-56].

Both of these proofs depend critically on the assumption that those who have a taste for discrimination are worse off than those who do not independently of any behavior. When the discrimination coefficient is used as a measure of absolute rather than relative unit cost, the discriminating group is assumed to be worse off than a nondiscriminating group or a group showing favoritism because the discriminators incur psychic “costs” which nondiscriminators do not. To make this point more dramatically, we will show that a taste for favoritism toward W labor by W capital leads to the identical equilibrium as a taste for discrimination against N labor, but has very different welfare implications. If W capital receives a per unit nonpecuniary benefit of $f_K^W \delta$ from working with W labor, this creates an equilibrium in which

\[ f_K^N = f_K^W (1 + \delta), \qquad \delta > 0, \tag{12} \]

where $\delta$ is a “favoritism” coefficient.^5 Equations (1') and (12) describe the same equilibrium if $\delta = d$. There is no way of establishing from equilibrium relative prices whether the price differential is caused by psychic costs of employing N labor or psychic benefits of employing W labor.

Taking $f_K^W \delta$ as unit nonpecuniary benefit derived by W capital in sector W, the net income of W is

\[ \hat{Y}(W) = f_L^W L_W + (f_K^W + f_K^W \delta)(K_W - E_K) + f_K^N E_K. \tag{13} \]

Substituting (12) gives us

\[ \hat{Y}(W) = f_L^W L_W + f_K^N K_W. \tag{14} \]

We will now show that $\partial \hat{Y}(W)/\partial E_K < 0$, or that favoritism increases W welfare:

\[ \frac{\partial \hat{Y}(W)}{\partial E_K} = f_{L E_K}^W L_W + f_{K E_K}^N K_W = -f_{LK}^W L_W + f_{KK}^N K_W < 0. \tag{15} \]


5 Becker suggests that favoritism can be introduced by using a negative discrimination coefficient. The implied negative discrimination coefficient in this case is $\bar{d} = -\delta/(1 + \delta)$, leading to $f_K^W = f_K^N (1 + \bar{d})$, which is equivalent to (12).





This holds since $f_{LK}^W > 0$ and $f_{KK}^N < 0$. When preferences are evaluated in terms of pecuniary gain from employing W labor, “discrimination” by W capital increases W welfare.
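The sign in (15) follows from the chain rule once the arguments of the marginal products are made explicit. The lines below are a sketch of that step; the hat on Y(W) is notation adopted here for net income under the favoritism interpretation, and capital in use is $K_W - E_K$ in sector W and $K_N + E_K$ in sector N.

```latex
\begin{align*}
\frac{\partial f_L^W(L_W, K_W - E_K)}{\partial E_K} &= -f_{LK}^W, &
\frac{\partial f_K^N(L_N, K_N + E_K)}{\partial E_K} &= f_{KK}^N, \\
\frac{\partial \hat{Y}(W)}{\partial E_K} &= -f_{LK}^W L_W + f_{KK}^N K_W < 0 .
\end{align*}
```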

A similar reversal occurs with respect to Becker’s proof that a group which is both an economic and a numeric minority is made worse off by discrimination. By substituting the definition of W net income (14) into (10), we get a new relative net income variable $\hat{R}$:

\[ \hat{R} = \frac{Y(N)}{\hat{Y}(W)}, \qquad \frac{\partial \hat{R}}{\partial E_K} = \frac{\hat{Y}(W)\,(\partial Y(N)/\partial E_K) - Y(N)\,(\partial \hat{Y}(W)/\partial E_K)}{[\hat{Y}(W)]^2} > 0, \tag{16} \]

since

\[ \frac{\partial Y(N)}{\partial E_K} > 0 \ \text{by (9)}, \qquad \frac{\partial \hat{Y}(W)}{\partial E_K} < 0 \ \text{by (15)}. \]

Thus, when taste for favoritism produces the flow of W capital into the W sector, N is always made relatively worse off.
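A small numerical illustration may help fix the signs. The sketch below is not taken from the note: it assumes Cobb-Douglas technologies f = L^(1-a) K^a in both sectors, arbitrary stocks, and hypothetical names. It solves the equilibrium condition in closed form and evaluates W net income under the cost interpretation, Eq. (3), and under the benefit interpretation, Eq. (14), together with N net income, Eq. (4).

```python
def equilibrium(d, a=0.5, L_W=1.0, K_W=4.0, L_N=1.0, K_N=1.0):
    """Equilibrium of f_K^N = (1 + d) f_K^W for f = L**(1 - a) * K**a (assumed form)."""
    c = (1 + d) ** (1 / (1 - a))
    # The condition reduces to L_N * (K_W - E) = c * L_W * (K_N + E); solve for E.
    E = (L_N * K_W - c * L_W * K_N) / (L_N + c * L_W)
    kW, kN = K_W - E, K_N + E                # capital in use in each sector
    fK_W = a * (L_W / kW) ** (1 - a)         # marginal product of capital, sector W
    fK_N = a * (L_N / kN) ** (1 - a)         # marginal product of capital, sector N
    fL_W = (1 - a) * (kW / L_W) ** a         # marginal product of labor, sector W
    fL_N = (1 - a) * (kN / L_N) ** a         # marginal product of labor, sector N
    return {
        "E_K": E,
        "gap": fK_N - (1 + d) * fK_W,            # should vanish in equilibrium
        "Y_W_cost": fL_W * L_W + fK_W * K_W,     # Eq. (3): psychic costs
        "Y_W_benefit": fL_W * L_W + fK_N * K_W,  # Eq. (14): psychic benefits
        "Y_N": fL_N * L_N + fK_N * K_N,          # Eq. (4)
    }


if __name__ == "__main__":
    for d in (0.0, 0.1, 0.2):
        print(d, equilibrium(d))
```

With these illustrative parameters, a larger coefficient lowers capital exports, lowers W net income as measured by (3), raises it as measured by (14), and lowers N net income in both cases, which is the pattern the argument describes.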

Actually, it is not necessary to associate absolute benefits or costs with 
discriminating behavior. Clearly a preference for A implies a preference 
away from “not A.” Without more information about subjective states 
of consciousness, it is not possible to say whether a preference for “not A" 
places one on a higher utility plane than a preference for A. Not making 
such judgments leaves us with much weaker welfare theorems, however. 
If we take the discrimination coefficient as measuring relative costs or 
benefits and do not use it as an index of absolute pleasure or pain, then 
we can conclude nothing about how discrimination or favoritism affects 
the welfare of the group discriminating or showing favoritism. We may conclude merely that they are maximizing utility and that the discrimination coefficient indicates how much discrimination is optimal from their private point of view. We can be more definite with respect to the effect of the discrimination on groups discriminated against, however. They always lose (since Becker’s proof that $\partial Y(N)/\partial E_K > 0$ is unaffected by whether W receives psychic costs or psychic benefits).

This agnosticism with respect to the effect of discrimination on the discriminator’s welfare is actually much more in keeping with the spirit of Becker’s analysis than are the welfare propositions which he proves. His stated objective in writing the book was to place the theory of discrimination on a more objective footing. As he wrote:

    In the sociopsychological literature on this subject one individual is said to discriminate against (or in favor of) another if his behavior toward the latter is not motivated by an “objective” consideration of fact. It is difficult to use this definition in distinguishing a violation of objective facts from an expression of tastes or values. For example, discrimination and prejudice are not usually said to occur when someone prefers looking at a glamorous Hollywood actress rather than at some other woman; yet they are said to occur when he prefers living next to whites rather than next to Negroes. At best calling just one of these actions “discrimination” requires making subtle and rather secondary distinctions. Fortunately, it is not necessary to get involved in these more philosophical issues. It is possible to give an unambiguous definition of discrimination in the market place and yet get at the essence of what is usually called discrimination.^6

To remain true to the spirit of a value-free definition of discrimination requires giving up the two strong welfare theorems discussed in this note.


References 

1. G. S. Becker, “The Economics of Discrimination,” 2nd ed., Univ. of Chicago Press, Chicago, 1971.

2. A. O. Krueger, The economics of discrimination, J. Polit. Econ. 71 (1963), 481-486. 

3. R. Marshall, The economics of racial discrimination: A survey, J. Econ. Lit. 12 
(1974), 849-871. 

Received: August 21, 1975; revised: April 19, 1976 

Richard S. Toikka 
The Urban Institute 
Washington, D.C. 20037 


Ref. [1, p. 13]. 



JOURNAL OF ECONOMIC THEORY 13, 478-483 (1976) 


A Household Consumption Model of Solid Waste 


Households, as producers of waste, outweigh industry by a ratio of at least three to one.^1 Cairncross has termed the household a “small factory” which buys inputs and capital goods, works with subcontractors, and even pays incentive wages (e.g., to the children), in order to produce utility. Becker has carried this concept to its logical conclusion with a model which uses commodities and time as inputs, and utility as an output.^2 Waste generation is directly related to time-commodity substitution. To Becker’s model can be added the concept of joint production.

Households engage in a number of activities, $Z_i$, in order to maximize their utilities. These activities involve an investment of both market commodities and time. Therefore, building on Lancaster’s model [8]:

\[ Z_i = f(X_{ij}, T_{ij}) \tag{1} \]

where $X_{ij}$ is a vector of j market commodities, and $T_{ij}$ is a vector of j time inputs used in the ith activity.^3 Utility may be defined as a function of the activities

\[ U = U(Z_1, \ldots, Z_m) = U(f_1, \ldots, f_m) = U(x_1, \ldots, x_n;\ T_1, \ldots, T_n) \tag{2} \]

subject to constraints on goods and time:

\[ \sum_{j=1}^{n} \sum_{i=1}^{m} p_j x_{ij} = w T_w, \tag{3a} \]

\[ \sum_{j=1}^{n} \sum_{i=1}^{m} T_{ij} = T - T_w. \tag{3b} \]

$p_j$ is a vector of unit prices of $x_j$, $T_w$ is the time spent working at wages w per unit time, $T_c$ is the total time spent on consumption, and T is the total time available.

1 [9]. It is questionable whether the market can efficiently choose between aggregate and disaggregate use [13].

2 McGauhey [5] has suggested the following rule of thumb for manufacturers: “Don’t make it without first looking down the road, beyond the consumer, to its final resting place in the environment.” And to both consumers and producers: “Don’t throw it away as long as it has resource value” [10]. Proceedings of the First National Conference on Packaging Waste (September 1969), Davis: University of California.

3 Lancaster [8], Chapters 6-8.



The production function (1) can be reexpressed

\[ T_{ij} = t_{ij} Z_i, \tag{4a} \]

\[ x_{ij} = b_{ij} Z_i, \tag{4b} \]

where $t_{ij}$ is a vector of j time inputs per unit of $Z_i$ and $b_{ij}$ is a similar vector of j commodity inputs. The constraints (3) can be combined with the functions in (4) to produce a single constraint

\[ \sum_{j=1}^{n} \sum_{i=1}^{m} (p_j b_{ij} + t_{ij} w) Z_i = Tw. \tag{5} \]

This equation states simply that time has a money value equal to wages paid per hour. That is, $\sum \sum p_j b_{ij} Z_i$ is the total money value of all commodities going into activities and Tw equals the realizable income if $T_c = 0$. Since $\sum \sum w (t_{ij} Z_i) = Tw - \sum \sum (p_j b_{ij} Z_i)$, w represents a price placed on each unit of time used for consumption activities.
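The step from (3) and (4) to (5) can be spelled out as follows. This is a sketch under one assumption: that the goods constraint (3a) equates total expenditure on commodities to wage income $w T_w$, which is the reading consistent with (3b) and (5).

```latex
\begin{align*}
\sum_j \sum_i p_j x_{ij} + w \sum_j \sum_i T_{ij}
  &= w T_w + w (T - T_w) = T w
  && \text{(adding (3a) to } w \text{ times (3b))} \\
\sum_j \sum_i \left( p_j b_{ij} + t_{ij} w \right) Z_i &= T w
  && \text{(substituting (4a) and (4b))}
\end{align*}
```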

The use of w implies that average wages equal marginal wages. While this is an unrealistic assumption, use of marginal wages does not add significantly to the presentation: $\sum \sum w (t_{ij} Z_i)$ and Tw simply become implicit functions. w will suffice for our purposes.

If we maximize (2) subject to (5), we get 

U' - U(Zy ,..., ZJ - p [tw - z z (p< b <i + z \ 


with 


dU'jdZ, = (x Z Z (P> b « + O’ = 1 . m 'J = !»-, «). 

where $\mu$ is the marginal utility of money income. This emphasizes the money effect of time on utility. When a consumer is maximizing utility, his choice among activities is wage sensitive. When his income rises, with prices constant, he moves from time consumptive activities to commodity consumptive activities. This can be illustrated graphically.

In Fig. 1, $t_i$, the time per unit of $Z_i$, occupies the abscissa and $b_i$, the
commodities per unit of $Z_i$, the ordinate. $U$ is an indifference curve. Lines
$w_1$ and $w_2$ represent budget lines indicating the relative value of $t_i$ and $b_i$
for the wage rates $w_2 > w_1$ with given commodity prices. The household
chooses $A$, the tangency point between $U$ and $w_1$, for $w_1$. If $w$ rises to $w_2$,
the household now chooses point $B$. The change from $A$ to $B$ involves a
reduction in $t_i$ from $\alpha$ to $\beta$, and an increase in $b_i$ from $\gamma$ to $\kappa$. An illustration
is the increasing use of packaging (e.g., TV dinners and nonreturnable
bottles) in developed nations, where real income is higher than in
underdeveloped nations.⁴
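The movement from $A$ to $B$ in Fig. 1 can be sketched numerically. The following is a minimal illustration under stated assumptions, not part of the original model: the indifference curve is replaced by the hypothetical curve $t_i b_i = 1$, the commodity price is a single number `p`, and the function name and parameter values are chosen only for illustration.

```python
import math

def tangency_point(w, p=1.0):
    """Return (t, b) minimizing the outlay p*b + w*t on the curve t*b = 1.

    The curve t*b = 1 is a hypothetical stand-in for the indifference
    curve U in Fig. 1; w is the wage rate and p the commodity price.
    """
    if w <= 0 or p <= 0:
        raise ValueError("wage and price must be positive")
    # Substituting b = 1/t gives outlay p/t + w*t, minimized at t = sqrt(p/w).
    t = math.sqrt(p / w)
    b = math.sqrt(w / p)
    return t, b

point_a = tangency_point(w=1.0)  # lower wage
point_b = tangency_point(w=4.0)  # higher wage
```

With these assumed values, time per unit falls from 1.0 to 0.5 while commodities per unit rise from 1.0 to 2.0, which is the same direction of movement as from $A$ to $B$.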

Clearly, unconsumed packaging and disposable bottles are material
goods produced by the household in its pursuit of maximum utility.
Theoretically, the household should be included in the product model, but
a difficulty exists because the household maximizes utility rather than
profit. An accommodation can be reached only by use of a simplifying
assumption, e.g., that the marginal utility of money income is constant
(as denoted by $\mu$).

This assumption is not applicable to the entire range of household
income; but, as Boulding [3] and Ciriacy-Wantrup [4] have indicated,
neither can the entire range of a firm’s production levels be discussed under
joint production.⁵ Hicks [6] has noted that marginal utility of money
income does not change significantly.⁶

With these observations in mind, we can substitute (4) for (5) to find
our major constraint:

\[ \sum_{j=1}^{n} (p_j x_j + t_j w) = Tw. \]

Similarly, our utility function will be

\[ U = U(x_1, \ldots, x_n;\ t_1, \ldots, t_n). \]


⁴ “The Waste-High Crisis,” Modern Packaging 41(11), 102 (November 1968).

⁵ Boulding [3] states that average costs cannot be determined under joint production.
Ciriacy-Wantrup [4] agrees, saying that while in theory average cost could be found
by integrating the marginal costs over the range of production and dividing, this
integration is not practicable.

⁶ See Hicks [6].


CONSUMPTION MODEL OF SOLID WASTE 




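Since we cannot produce time, but only use it as an input, $t_j < 0$
$(j = 1, \ldots, n)$, and $T < 0$. Maximizing, we have

\[ U'' = U(x_1, \ldots, x_n;\ t_1, \ldots, t_n) - \mu \Bigl[ Tw - \sum_{j=1}^{n} (p_j x_j + t_j w) \Bigr]; \]

and

\[ \partial U'' / \partial x_j = \mu p_j - (\partial U / \partial x_j) \qquad (j = 1, \ldots, n), \]

\[ \partial U'' / \partial t_j = -\mu w - (\partial U / \partial t_j) \qquad (j = 1, \ldots, n), \]

or

\[ p_i / w = -\,dt_i / dX_i , \]

for any good, $X_i$, and the time associated with the use of that good.
These results are essentially the same as those of the firm’s production
function.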

In the case of household consumption, let us now assume four goods:
$X_i$, $X_t$, $X_w$, and $X_E$ (material input; time-input, valued as man labor
hours or wages; and external effect). A fifth good, less tangible, involves
utility through the use of $X_i$. We will call this $\mu X_u$, where $\mu$ is the marginal
utility of money. If waste has a negative value, we have the following
relation:

\[ \mu MR_u = -MR_w - MR_e - MR_t - MR_i . \tag{6} \]

$-MR_t$, we remember, is the wage rate. Once again, we must understand
that $-MR_w$ is the cost imputed from the inputs that are used to handle
waste. If $X_t$ and $X_i$ are these inputs, then we can quickly realize that more
of $X_i$ will be used as income, and hence consumption, rises. If $X_i$ has a
propensity for causing waste, we can expect more waste to be generated.

If a third input, labor, is included, it has a similar effect on $X_i$. As the
cost of domestic service rises, for example, more of $X_i$ is used, causing
more waste. We would expect the very wealthy to use $X_i$ inputs of sufficient
value to account for the use of labor instead. Hence, an overall effect of
income distribution on waste may be ascertained. Low-income groups,
whose time is worth less, will prefer time-inputs to material-inputs.
Similarly, high-income groups will prefer labor-inputs to the use of their
material-inputs. It is the wide middle that prefers material-inputs and
generates the most waste. Hence, as income is equalized, more waste per
capita will be generated.

This supposes that the consumer is not paying external costs. Buchanan 
and Stubblebine’s conditions imply that the consumer’s utility must equal 
all the internal costs plus the external cost imposed. This means that he can 
afford to increase his costs as long as this marginal increase results in a
decrease of marginal externalities by an equal amount. There are two
efficient ways in which to accomplish this. One is to select goods in 
accordance with their propensity to cause externalities. This will increase 
the demand for low-waste propensity goods and decrease the demand for 
high-waste propensity goods. The second way is to spend more on the 
handling of waste. If both methods are simultaneously utilized, that is, 
if there is an accounting of externalities, goods requiring higher time-inputs 
will become more popular with middle-income groups than they are now. 

If waste can be recycled, it becomes an asset. Not only does recycling 
decrease the cost of waste, but it decreases externalities. Only aesthetic 
external costs remain, which can be associated with the handling of any 
good but, of course, require accounting. The significance is that recycling 
eliminates the concept of waste, leaving an input-output flow in its place. 

A problem still remains between consumers and firms. Firms will raise 
prices of goods produced with wastes to account for the added costs. 
Part of these costs result from externalities occurring between the firm 
and the community. An increase in price effectively decreases purchasing 
power. Externalities should only be reduced to the point where their 
marginal cost reduces utility by as much as the marginal decrease in 
purchasing power which results from their correction. This problem is 
just as important where an input to a waste producing process feels the 
indirect effect of the increase in price. Once again, relevant costs must be 
weighed to assure efficiency. 
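The marginal rule stated above can be illustrated with a small sketch. It assumes, purely for illustration, linear schedules that do not appear in the original: the marginal utility gain from correcting one more unit of externality is `d0 - d1 * a`, and the marginal utility loss from the accompanying fall in purchasing power is `c * a`, where `a` is the amount of correction. All names and numbers are hypothetical.

```python
def efficient_correction(d0, d1, c):
    """Amount of correction a at which marginal gain equals marginal loss.

    Hypothetical linear schedules (assumptions, not from the paper):
      marginal gain from reducing the externality:  d0 - d1 * a
      marginal loss of purchasing power:            c * a
    Setting the two equal gives a = d0 / (d1 + c).
    """
    if d0 < 0 or d1 < 0 or c < 0 or d1 + c == 0:
        raise ValueError("parameters must be non-negative with d1 + c > 0")
    return d0 / (d1 + c)

a_star = efficient_correction(d0=10.0, d1=1.0, c=4.0)
marginal_gain = 10.0 - 1.0 * a_star
marginal_loss = 4.0 * a_star
```

Under these assumed schedules the two margins coincide at the efficient amount of correction; beyond that point, further correction costs more in purchasing power than it returns in reduced externalities.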


Conclusion 

This paper is an exposition of Lancaster’s model [8] of consumption 
activities, with implications for solid waste. The main conclusion is that 
in a Lancastrian model with no income effects, higher incomes lead to a 
substitution away from time intensive activities. Higher income consumers 
will tend to use disposable products rather than returnables. This 
conclusion assumes consumers do not pay external costs. The consumer 
can afford to increase his costs as long as this marginal increase results in 
a decrease of marginal externalities by an equal amount, either by selecting 
goods according to their propensity to cause externalities, or by spending
more on the handling of waste. With recycling, both the cost of waste and 
its externalities decrease. Since recycling adds costs (including externalities 
between producers and the community) which raise price and reduce 
purchasing power, externalities should only be reduced to where their 
marginal cost reduces utility by as much as the marginal decrease in 
purchasing power which results from their correction. 





A means of approaching joint production has been considered, and a 
framework of marginal conditions for looking at consumption identified. 
Only internal costs, however, have been incorporated. It remains for the 
externalities of social costs associated with solid waste management to 
demonstrate the interrelation of all parts of the community and economy 
in respect to solid waste. A system of social accounting must, finally, be 
included in the joint production model, which exceeds the scope of this 
analysis. 


References 

1. Aerojet-General Corp. and Engineering Science, Inc., A systems study of solid 
waste management in the Fresno area, U. S. Bureau of Solid Waste Management, 
1969. 

2. H. J. Barnett, Pressures of growth upon environment, in “Environmental Quality
in a Growing Economy,” (H. Jarrett, Ed.), Johns Hopkins Press, Baltimore, 1966. 

3. K. E. Boulding, “Economic Analysis,” (1st Edition), Harper Brothers, New York, 
1941. 

4. S. V. Ciriacy-Wantrup, Economics of joint costs in agriculture, Journal of Farm
Economics 23(4), 771-818.

5. C. G. Golueke and McGauhey, Comprehensive Studies of Solid Waste Management,
National Research Council, 1970.

6. J. R. Hicks, The rehabilitation of consumer surplus. Review of Economics and 
Statistics 9, 108-116. 

7. K. W. Kapp, “Social Cost of Business Enterprise,” (2nd Edition), Asia Publishing 
House, Bombay, 1963. 

8. K. Lancaster, “Consumer Demand”, Columbia Univ. Press, New York, 1971. 

9. H. H. Landsberg, The U.S. resource outlook: quality and quantity, Daedalus 96(4): 
1034, Fall 1967. 

10. P. H. McGauhey, Develop strategies for packaging waste management. Proceedings 
of the First National Conference on Packaging Waste, September 1969. University 
of California, Davis. 

11. P. H. McGauhey, Processing, converting, and utilizing solid waste. Proceedings 
of the National Council on Solid Waste Research, 1969. 

12. J. R. Sheaffer, Metropolitan problems in refuse disposal, Proceedings of the 
National Conference on Solid Waste Research, December 1963. 

13. W. A. Vogeley, The economic factors of mineral waste utilization, Proceedings 
of the Symposium of Mineral Waste Utilization, 1968. 

Received: September 4, 1975 

Conway L. Lackman 
Department of Economics 
Rutgers University 
New Brunswick, New Jersey 08903 





JOURNAL OF ECONOMIC THEORY 13, 484-485 (1976)


Technological Condition for Balanced Growth: 
A Note on Professor Whitaker’s Contribution 


After the publication of my joint paper [1], Professor Whitaker of 
Virginia University kindly brought to my attention his earlier contribution 
[2] which deals with the same problem and which in fact goes a long way 
towards the correct identification of the class of disembodied technical 
changes that are consistent with balanced growth of a one-sector, constant
returns-to-scale economy.

In particular, Professor Whitaker shows by example that if, as is usual,
the constancy of relative (imputed) factor shares is not made a part of the
definition of balanced growth, then capital and output can grow at a
common constant rate, and labor at another constant rate, even though
the technical changes present are of the capital-using type in the Harrod
sense.

Professor Whitaker also noted that the class of production functions
$Y = G(K, L; t)$, defined implicitly by the relationship

\[ Y/K = g\bigl((L/K)\, e^{\alpha(K/Y)\, t}\bigr) \tag{1} \]

does not “generally represent a technology experiencing Harrod-neutral 
technical progress” [2, p. 1059], although it is capable of generating 
balanced growth paths. 

However, Professor Whitaker did not discuss the general properties of 
the class of production functions (1). In particular, he did not obtain the 
conditions which should be imposed on $\alpha = \alpha(K/Y)$ in order that
$G(K, L; t)$ remain an ordinary neoclassical production function, with the
marginal products positive and nonincreasing. He also falsely concluded 
that while in his example labor’s (imputed) share in output is monotonically 
decreasing to zero when time goes to infinity, yet “relative examples in 
which the changing labor share converges on a positive asymptote, 
rather than on the zero asymptote, can readily be devised” [2, p. 1058]. 
In fact, however, they cannot. 

To see this, let us differentiate Eq. (1) with respect to $L$, solve for $\partial Y/\partial L$,
and multiply by $L/Y$ to obtain labor’s share:

\[ \eta_L = \bigl(1 - (K/Y)\, \eta_L\, \alpha' t\bigr)(K/Y)\, x g' = \frac{x g'}{(Y/K) + (K/Y)\, x g' \alpha' t}, \tag{2} \]


Copyright © 1976 by Academic Press, Inc.

All rights of reproduction in any form reserved.





where $x = (L/K) \exp[\alpha(K/Y)\, t]$ is the argument of the function $g$ in Eq. (1),
and $g' = dg/dx$, $\alpha' = d\alpha/d(K/Y)$. Since under balanced growth $Y/K$, $x$,
$\alpha'$ and $g'$ are all constant over time, it follows from Eq. (2) that $\eta_L$ is either
constant when $\alpha' = 0$, i.e., when technical progress is Harrod-neutral,
or it tends gradually to zero when $\alpha' > 0$, i.e., when technical progress is
capital-using in the Harrod sense.
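The limiting behavior of labor's share can be checked numerically from the closed form in Eq. (2). The sketch below simply evaluates that expression while holding $Y/K$, $x$, $g'$ and $\alpha'$ fixed, as they are under balanced growth; the function name and the parameter values are illustrative assumptions, not taken from the note.

```python
def labor_share(x, g_prime, y_over_k, alpha_prime, t):
    """Evaluate eta_L = x g' / (Y/K + (K/Y) x g' alpha' t), as in Eq. (2)."""
    if y_over_k <= 0:
        raise ValueError("Y/K must be positive")
    k_over_y = 1.0 / y_over_k
    return (x * g_prime) / (y_over_k + k_over_y * x * g_prime * alpha_prime * t)

# Illustrative constants along a balanced growth path (assumed values).
params = dict(x=1.0, g_prime=0.3, y_over_k=0.5)

# alpha' = 0: Harrod-neutral case, the share does not depend on t.
neutral = [labor_share(alpha_prime=0.0, t=t, **params) for t in (0, 10, 100)]

# alpha' > 0: capital-using case, the share declines as t grows.
capital_using = [labor_share(alpha_prime=0.1, t=t, **params) for t in (0, 10, 100)]
```

With these assumed constants the share stays fixed in the Harrod-neutral case and falls steadily toward zero in the capital-using case, since the denominator grows linearly in $t$ while the numerator is constant.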

Nevertheless, I feel that note should be taken of Professor Whitaker’s 
earlier contribution. 


References 

1. A. Chilosi and S. Gomulka, Technological Condition for Balanced Growth: 
A Criticism and Restatement, J. Econ. Theory 9 (1974), 171-184. 

2. J. K. Whitaker, Harrod-neutral technical progress and the possibility of steady 
growth, Rivista Internazionale di Scienze Economiche e Commerciali 11 (1970).

Received: November 11, 1975; revised: April 12, 1976 

STANISLAW GOMULKA*
London School of Economics
London, England 


* I am grateful to an anonymous referee for a technical suggestion. 



JOURNAL OF ECONOMIC THEORY 13, 486-487 (1976) 


On the History of Concepts of Fairness 


The recent article of T. Daniel [1] attributes some discoveries to me 
which in fact I did not make. All references in my original paper [4] were 
relegated to footnotes. Although I thought that this improved the
readability, it apparently led to some confusion involving the attributions.
I welcome this opportunity to set the record straight. In particular, 

(1) It was David Gale who first noticed the flaw in Schmeidler and 
Yaari’s original proof. When I heard of this problem, I constructed the 
counterexample given in Fig. 1 in my paper and tried to find a
generalization of the fairness concept that would exist under more general
hypotheses. The contribution of the pioneering work of Schmeidler and 
Yaari is fully acknowledged in footnotes 4 and 7 in my original paper. 

(2) The fact that fair allocations might not exist in the production 
case is due to Pazner and Schmeidler [2]. The example given in my paper
is due to them, as I explicitly stated in footnote 8. Their argument was 
repeated as Example 3.1 in the Appendix to my paper since their paper 
was at the time unpublished. 

(3) The concept of income-fairness is also due to Pazner and 
Schmeidler [3] although they do not refer to it by that name. In my paper 
I outlined a brief proof of existence and referred the reader to their
original paper in footnote 9. Pazner and Schmeidler are also given full 
credit for this concept in [5]. 


References 


1. T. Daniel, A revised concept of distributional equity, J. Econ. Theory 11 (1975),
94-109. 

2. E. Pazner and D. Schmeidler, A difficulty in the concept of fairness. Rev. Econ. 
Studies 41 (1974), 441-443. 

3. E. Pazner and D. Schmeidler, Decentralization, income distribution and the role 
of money in socialist economies. Technical Report No. 8, The Foerder Institute 
for Economic Research, Tel-Aviv University, 1972; Economic Inquiry, to appear. 



4. H. Varian, Equity, envy, and efficiency, J. Econ. Theory 9 (1974), 63-91. 

5. H. Varian, Distributive justice, welfare economics and the theory of fairness, 
Philosophy and Public Affairs 4 (1975), 223-247. 


Received: April 4, 1976; revised: June 7, 1976


Hal R. Varian 

Department of Economics 
Massachusetts Institute of Technology 
Cambridge, Massachusetts 02139 



JOURNAL OF ECONOMIC THEORY 13, 488 (1976) 


Announcement and Call for Papers 

European Meeting of the Econometric Society 

Vienna, 1977 


The next European Meeting of the Econometric Society will be held in 
Vienna, Austria, from September 6 to September 9, 1977. 

Proposals for contributed papers to be presented at the Meeting are 
being actively solicited by the Programme Committee. Those wishing to 
present papers should submit proposals directly to the corresponding 
Programme Chairmen: 

Econometrics: Dr. Jean-Francois Richard, CORE, 54 de Croylaan, 3030 
Heverlee, Belgium. 

Economics: Professor Jean-Michel Grandmont, CEPREMAP, 142 rue du 
Chevaleret, 75013 Paris, France. 

Papers should be submitted in duplicate, accompanied by three copies 
of an abstract. Final decisions will be based only on full manuscripts; 
thus, please submit papers at your earliest convenience. No submissions 
can be considered after April 15, 1977. 

Registration forms may be obtained by writing to the Local Organising 
Chairman: Professor G. Schwödiauer, The Institute of Advanced Studies,
Stumpergasse 56, A 1060 Vienna, Austria. The Society will provide its 
members with the registration forms. 




JOURNAL OF ECONOMIC THEORY 13, 489 (1976)

Author Index for Volume 13 


B

Barro, Robert J., 229
Blair, Douglas H., 361
Bordes, Georges, 361

C

Champsaur, Paul, 438
Constantinides, George M., 245

D

Dierker, E., 428

E

El-Safty, Ahmad E., 298, 319

F

Fishburn, Peter C., 14
Fourgeaud, C., 428
Fuchs, Gérard, 201

G

Gärdenfors, Peter, 217
Gehrlein, William V., 14
Glustoff, Errol, 439
Gomulka, Stanislaw, 484

H

Hammond, Peter J., 329
Hansson, Bengt, 193
Hartog, Joop, 448
Hartwick, John, 396

K

Kelly, Jerry S., 138, 361
Kohlberg, Elon, 1
Kolm, Serge-Christophe, 82

L

Lackman, Conway L., 478
Laroque, Guy, 458
Lesourne, Jacques, 118

M

Magill, Michael J. P., 245, 264
Majumdar, Mukul, 26, 47
Manning, R., 380
Maschler, Michael, 184
McFadden, Daniel, 26
Miller, Bruce L., 154
Mitra, Tapan, 26, 47

N

Neuefeind, W., 428

P

Pollak, Robert A., 272

R

Rader, Trout, 58
Roberts, John, 112
Ross, Stephen A., 341

S

Sahlquist, Henrik, 193
Schwartz, Thomas, 414
Schweizer, Urs, 396
Sonnenschein, Hugo, 112
Suzumura, Kotaro, 361
Svensson, Lars E. O., 169

T

Taylor, Carol A., 67
Toikka, Richard S., 472

V

Varaiya, Pravin, 396
Varian, Hal R., 486




Printed in Belgium 




